Introduction 


Banshee Graphics Engine is the second generation 3D graphics engine based on the original SST1 
architecture. Banshee incorporates all of the original SST1 features, such as true-perspective texture 
mapping with advanced mipmapping and lighting, texture anti-aliasing, sub-pixel correction, Gouraud 
shading, depth-buffering, alpha blending and dithering. In addition to the SST1 features, Banshee 
includes a VGA core, 2D graphics acceleration, and support for Intel’s AGP bus. 


Features 


SST1 baseline features 

SST1 software compatible 

AGP / PCI bus compliant 

Native VGA core 

2D acceleration 

Binary/Ternary operand raster ops 
Screen to Screen, Screen to Texture space, and Texture space to Screen Blits. 
Color space conversion YUV to RGB. 
1:N monochrome expansion 
Rendering support of 2048x2048 
Integrated DAC and PLLs. 

Bilinear video scaling 

Video in via feature connector 
Supports SGRAM memories 


Video-In: 


decimation 

support for interlaced video data 

support VMI, SAA7110 video connectors 
triple buffers for video-in data 


Video-Out: 


Bilinear scaling zoom-in (from 1x to 10x magnification in increments of 0.25x) 
decimation for zoom-out (0.25x, 0.5x, 0.75x) 

chroma-keying for video underlying and overlaying 

support for stereoscopic display 

hardware cursor 

double buffer frame buffers for video refresh 

DDC support for monitor communication 

DPMS mode support 

overlay windows (for 3D and motion video) 
Resolutions 


16/256K 
16/256K 
16/256K 
16/256K 
16/256K 
16/256K 
4/256K 
2/256K 
mono 
16/256K 
16/256K 
mono 
16/256K 
2/256K 
16/256K 
256/256K 


Mode #   Mode Type   # of Colors   Native Resolution   Alpha Format 

100      Graphics    256/256K      640x400             80x25 
101      Graphics    256/256K      640x480             80x30 


2. Functional Overview 


System Level Diagram 


In its entry configuration, a Banshee graphics solution consists of a single ASIC + RAM. Banshee is a 
PCI slave device that receives commands from the CPU via direct writes or through memory-backed FIFO 
writes. Banshee includes an entire VGA core, 2D graphics pipeline, 3D graphics engine, texture raster 
engine, and video display processor. Banshee supports all VGA modes plus a number of VESA modes. 
[Figure: system level diagram. The H3 graphics chip sits on the PCI system bus and connects to the 
frame buffer memory (4/8/16 Mbytes of SGRAM), the feature connector, and the monitor.] 
Architectural Overview 


Graphical Overview 


The diagram below illustrates the overall architecture of the Banshee graphics subsystem. 


[Figure: Banshee architecture block diagram, showing the PCI/AGP interface, FIFOs, feature connector, 
2D engine, LFB host port, 8x8x24 palette, endian conversion, write buffer, and the path to the 
memory controller.] 


Functional Overview 


Bus Support: Banshee implements both the PCI bus specification 2.1 and AGP specification 1.0 protocols. 
Banshee is a slave only device on PCI, and a master device on AGP. Banshee supports zero-wait-state 
transactions and burst transfers. 


PCI Bus Write Posting: Banshee uses a synchronous FIFO 32 entries deep, which allows sufficient write 
posting capabilities for high performance. The FIFO is asynchronous to the graphics engine, thus 
allowing the memory interface to operate at maximum frequency regardless of the frequency of the PCI 
bus. Zero-wait-state writes are supported for maximum bus bandwidth. 


VGA: Banshee includes a 100% IBM PS/2 model 70 compatible VGA core, which is highly optimized for 
128-bit memory transfers. The VGA core supports the PC 97 requirements for multiple adapters and VGA 
disable. 
Memory Architecture: The frame buffer controller of Banshee has a 128-bit wide datapath with support for 
up to 100 MHz SGRAMs or SDRAMs. For 2D fills using the standard 2D bitBLT engine, eight 16-bit pixels 
are written per clock, resulting in an 800 Mpixel/sec peak fill rate. For screen clears using the color 
expansion capabilities specific to SGRAM, 64 bytes are written per clock, resulting in a 6.4 Gbytes/sec 
peak fill rate. The minimum amount of memory supported by Banshee is 4 Mbytes, with a maximum of 16 
Mbytes supported. 
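
The peak figures quoted above follow directly from the datapath width and the memory clock. The short C sketch below works through that arithmetic, assuming the 100 MHz memory clock stated above; the function names are illustrative only and are not part of any Banshee software interface.

```c
#include <stdint.h>

/* Number of pixels that fit in one datapath-wide write. */
static uint32_t pixels_per_clock(uint32_t datapath_bits, uint32_t bits_per_pixel)
{
    return datapath_bits / bits_per_pixel;
}

/* Peak rate in millions of units per second:
 * (units written per clock) * (clock frequency in MHz). */
static uint32_t peak_rate_mega(uint32_t units_per_clock, uint32_t clock_mhz)
{
    return units_per_clock * clock_mhz;
}
```

With a 128-bit datapath and 16-bit pixels, pixels_per_clock() gives 8, and peak_rate_mega(8, 100) gives 800 (Mpixels/sec). For the SGRAM color expansion case, peak_rate_mega(64, 100) gives 6400 Mbytes/sec, i.e. 6.4 Gbytes/sec.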


Host Bus Addressing Schemes: Banshee occupies a combined 64 Mbytes of memory mapped address 
space, using two PCI memory base address pointers. Banshee also occupies 256 bytes of I/O mapped 
address space for video and initialization registers. The register space of Banshee occupies 6 Mbytes of 
address space, the linear frame buffer occupies 32 Mbytes of address space. 


2D Architecture: Banshee implements a full featured 128-bit 2D windows accelerator capable of 
displaying 8, 16, 24, and 32 bits-per-pixel screen formats. Banshee supports 1, 8, 16, 24, and 32 
bits-per-pixel RGB source pixel maps for BitBlts. 4:2:2 and 4:1:1 YUV colorspaces are supported as source 
bitmaps for host-to-screen BitBlts. Banshee supports screen-to-screen and host-to-screen stretch BitBlts at 
100 Mpixels/sec. Banshee supports source and destination colorkeying, multiple clip windows, and full 
support of ternary ROPs. Patterned Bresenham line drawing with full ROP support, along with polygon 
fills, is supported in Banshee’s 2D core. Fast solid fills, pattern fills, and transparent monochrome 
bitmap BitBlts are supported in 8 bits-per-pixel, 16 bits-per-pixel, and 32 bits-per-pixel modes. 
3. Banshee Address Space 

Memory Base0 

Memory Address 
0x0200000 - 0x07FFFFF 
0x0800000 - 0x0BFFFFF 
0x0C00000 - 0x0FFFFFF    YUV planar space 

Memory Base1 

Memory Address 
0x0000000 - 0x1FFFFFF 

I/O Base0 

I/O Address 

Initialization registers 
0x0c - 0x0f    lfbMemoryConfig register 
0x10 - 0x13 
0x14 - 0x17 
0x30 - 0x33    dramCommand register (see 2D offset 0x70) 
0x34 - 0x37    dramData register (see 2D offset 0x064) 
0x38 - 0x3b 
0x1c - 0x1f 

PLL and DAC registers 
0x40 - 0x43    pllCtrl0 
0x44 - 0x47    pllCtrl1 
0x48 - 0x4b    pllCtrl2 
0x4c - 0x4f 

Video Registers part I 
rgbMaxDelta register 
vidProcCfg register 
0x60 - 0x63    hwCurPatAddr register 
vidInFormat register 
vidInDecimInitErrs register 
vidChromaMax register 
vidScreenSize register 
vidOverlayDudx register 
vidOverlayDvdy register 

VGA Registers 

Video Registers part II 
vidDesktopOverlayStride register 
vidInAddr0 register 

VGA Address Space 

0x03CE - 0x03CF 


Copyright © 1996-1999 3Dfx Interactive, Inc. Revision 1.01 
14 Printed 03/01/99 




Voodoo Banshee Universal Access 2d Databook 


4. Basic Init Procedures 


Primary device 


When Banshee is a primary device, and it’s installed in a system capable of running x86 BIOS, the 
Banshee BIOS will perform the basic card init. This includes setting up the memory timings, clock 
settings, and other startup parameters. 


Secondary device, or on non-x86 platforms 


If Banshee is running as a secondary device, the BIOS does not have a chance to set up the proper startup 
values. The proper values are stored in the Banshee ROM. In order to set up the Banshee correctly, these 
values need to be read out of the ROM installed on the card and programmed into the appropriate 
registers. 


On non-x86 systems, the expansion ROM will often already have a PCI Expansion ROM base address. 
However, most PC systems blank out the expansion ROM address after shadowing the x86 part of the 
expansion ROM into system memory. If for some reason the expansion ROM does not have a PCI base 
address, on Banshee it is safe to borrow PCI address space from the linear frame buffer. The address 
decoding hardware guarantees that if the expansion ROM is mapped on top of the linear frame buffer, the 
expansion ROM will have priority. In order to do this, set the PCI config space expansion ROM base address 
to be the same as the frame buffer base address for that card, perform the necessary accesses to the ROM, and 
then set the expansion ROM base address back to its previous value. Below is some example code to 
perform this address space borrowing and access the ROM. It makes use of standard PCI access functions, 
and the external “mdc_getcrc_buffer()”, which computes the CRC32 of the buffer. 


FxBool bansheeChecksumRom(LPCARDINFO card, FxU32 *checksum_1sthalf, FxU32 *romSize) { 
    SstIORegs *ioregs = (SstIORegs *)(card->NatMem0.MappedAddr); 
    FxU32 old_miscInit1; 
    FxU32 base_addr, old_base_addr; 
    FxU32 phys_addr; 
    FxU32 mapped_addr; 
    FxU32 size; 
    FxU32 checksum_1; 
    char *ptr; 

    /* remember the current expansion ROM base so it can be restored */ 
    pciGetConfigData(PCI_ROM_BASE_ADDRESS, card->pciDevNum, &base_addr); 
    old_base_addr = base_addr; 

    /* we are going to borrow address space from the MMIO registers */ 
    base_addr = card->PCIBase0 + 0x800000; 
    phys_addr = base_addr & ~0x7FF; 

    if (ioregs->reservedZ[1] & BIT(1)) { 
        size = BANSHEE_ROM_SIZE; /* 64k */ 
        if (romSize) { *romSize = 64; } 
    } else { 
        size = BANSHEE_ROM_SIZE/2; /* 32k */ 
        if (romSize) { *romSize = 32; } 
    } 

    mapped_addr = card->NatMem0.MappedAddr + 0x800000; 
    old_miscInit1 = ioregs->miscInit1; 
    printf("old miscInit1 = 0x%X\n", old_miscInit1); 
    ioregs->miscInit1 |= BIT(25); 

    printf("banshee ROM base (0x%X,0x%X,size = %d)\n", 
           base_addr, phys_addr, size); 

    /* enable ROM address decoding */ 
    base_addr &= ~0x1; 
    pciSetConfigData(PCI_ROM_BASE_ADDRESS, card->pciDevNum, &base_addr); 
    base_addr |= 1; 
    pciSetConfigData(PCI_ROM_BASE_ADDRESS, card->pciDevNum, &base_addr); 

    /* checksum the rom */ 
    ptr = (char *)mapped_addr; 

    /* compare as unsigned bytes so that 0xAA matches when char is signed */ 
    if (((unsigned char)ptr[0] != 0x55) || ((unsigned char)ptr[1] != 0xAA)) { 
        /* signature didn't match! */ 

        /* disable ROM address decoding */ 
        old_base_addr &= ~0x1; 
        pciSetConfigData(PCI_ROM_BASE_ADDRESS, card->pciDevNum, &old_base_addr); 

        printf("Couldn't find ROM signature: 0x55, 0xAA\n"); 
        ioregs->miscInit1 = old_miscInit1; 
        return (FXFALSE); 
    } 

    printf("ptr = 0x%X, size=0x%X\n", mapped_addr, size); 

    checksum_1 = mdc_getcrc_buffer(ptr, size/2); 
    printf("checksum 1 = 0x%X\n", checksum_1); 

    if (checksum_1sthalf) { 
        *checksum_1sthalf = checksum_1; 
    } 
    ioregs->miscInit1 = old_miscInit1; 

    /* disable ROM address decoding */ 
    old_base_addr &= ~0x1; 
    pciSetConfigData(PCI_ROM_BASE_ADDRESS, card->pciDevNum, &old_base_addr); 

    return FXTRUE; 
} 
Once you can access the expansion ROM, you will need to find the important startup data. In the ROM, 
there is a table of configuration data which looks like this: 


struct tblOEMConfig_struct { 
    UINT32 regPCIInit0;    /* IOBase[04h] */ 
    UINT32 regMiscInit0;   /* IOBase[10h] */ 
    UINT32 regMiscInit1;   /* IOBase[14h] */ 
    UINT32 regDRAMInit0;   /* IOBase[18h] */ 
    UINT32 regDRAMInit1;   /* IOBase[1Ch] */ 
    UINT32 regAGPInit0;    /* IOBase[20h] */ 
    UINT32 regPLLCtrl1;    /* IOBase[44h] */ 
    UINT32 regPLLCtrl2;    /* IOBase[48h] */ 
    UINT32 regSGRAMMode;   /* IOBase[30h] and IOBase[10Dh] */ 
}; 


To find the OEM Config Table, follow these steps: 
1) At offset 50h of the x86 ROM image, there is a word value. This value is the offset, within the ROM 
image, of the ROM Config Table. 
2) The first WORD of the ROM Config Table is an offset, within the ROM image, of the above OEM 
Config Table. 
3) If the pointer to the OEM Config Table is “0”, then there is no OEM Config Table. (This will not 
happen in any release versions of the Banshee ROM.) 
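
The three lookup steps above can be sketched in C as follows. This is a minimal sketch, assuming the ROM image has already been copied into a byte buffer and that the word values are stored little-endian, as on x86. The helper names (rom_read16, find_oem_config_table) and the bounds checks are illustrative additions, not part of the Banshee BIOS or driver sources.

```c
#include <stddef.h>
#include <stdint.h>

#define ROM_CONFIG_PTR_OFFSET 0x50u       /* step 1: word at offset 50h        */
#define OEM_CONFIG_TABLE_SIZE (9u * 4u)   /* nine UINT32 fields in the struct  */

/* Read a 16-bit little-endian word from the ROM image (hypothetical helper). */
static uint16_t rom_read16(const uint8_t *rom, size_t offset)
{
    return (uint16_t)(rom[offset] | ((uint16_t)rom[offset + 1] << 8));
}

/* Returns the offset of the OEM Config Table within the ROM image,
 * or 0 if there is no table or a pointer falls outside the image. */
static size_t find_oem_config_table(const uint8_t *rom, size_t rom_size)
{
    size_t rom_cfg, oem_cfg;

    if (rom_size < ROM_CONFIG_PTR_OFFSET + 2u)
        return 0;

    /* Step 1: offset of the ROM Config Table. */
    rom_cfg = rom_read16(rom, ROM_CONFIG_PTR_OFFSET);
    if (rom_cfg + 2u > rom_size)
        return 0;

    /* Step 2: first word of the ROM Config Table points at the OEM Config Table. */
    oem_cfg = rom_read16(rom, rom_cfg);

    /* Step 3: a zero pointer means there is no OEM Config Table. */
    if (oem_cfg == 0 || oem_cfg + OEM_CONFIG_TABLE_SIZE > rom_size)
        return 0;

    return oem_cfg;
}
```

The returned offset can then be used to read the nine 32-bit values in the order given by tblOEMConfig_struct and program them into the listed I/O registers.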


Setting up VESA desktop modes 


The easiest way to set up the VESA standard desktop modes is to make use of the VGA BIOS Parameter 
Tables. The contents of these tables have been included here for your convenience. The layout of these 
Parameter Tables is standardized, and should be available in popular VGA books. However, the layout is also 
included below for convenience. Keep in mind that the “byte number” in the table below is the location in 
the BIOS Parameter list, and is not related to the register location. You can find the register locations for 
the listed registers in any standard VGA book. Also keep in mind that in order to write the extended 
CRTC registers, you must unlock them by writing a 0 to bit 7 of CR11. 
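
The unlock step can be sketched as a read-modify-write of CR11. This is a minimal sketch, assuming the standard VGA color-mode CRTC ports (index at 3D4h, data at 3D5h); the port access callbacks and the small in-memory simulation are hypothetical stand-ins so the routine can be exercised without hardware, and would be replaced by the platform's real port I/O routines.

```c
#include <stdint.h>

#define VGA_CRTC_INDEX 0x3D4u   /* standard VGA CRTC index port (color modes) */
#define VGA_CRTC_DATA  0x3D5u   /* standard VGA CRTC data port (color modes)  */
#define CRTC_CR11      0x11u
#define CR11_LOCK_BIT  0x80u    /* bit 7 of CR11 */

typedef uint8_t (*port_read_fn)(uint16_t port);
typedef void (*port_write_fn)(uint16_t port, uint8_t value);

/* Clear bit 7 of CR11, preserving the other bits of the register. */
static void crtc_unlock(port_read_fn in8, port_write_fn out8)
{
    uint8_t cr11;

    out8(VGA_CRTC_INDEX, CRTC_CR11);
    cr11 = in8(VGA_CRTC_DATA);
    out8(VGA_CRTC_INDEX, CRTC_CR11);
    out8(VGA_CRTC_DATA, (uint8_t)(cr11 & ~CR11_LOCK_BIT));
}

/* In-memory model of the CRTC index/data pair, for testing only. */
static uint8_t sim_crtc_index;
static uint8_t sim_crtc_regs[256];

static uint8_t sim_in8(uint16_t port)
{
    return (port == VGA_CRTC_DATA) ? sim_crtc_regs[sim_crtc_index] : 0;
}

static void sim_out8(uint16_t port, uint8_t value)
{
    if (port == VGA_CRTC_INDEX)
        sim_crtc_index = value;
    else if (port == VGA_CRTC_DATA)
        sim_crtc_regs[sim_crtc_index] = value;
}
```

Because the routine only clears bit 7, the remaining CR11 bits keep whatever value the current mode programmed.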


Byte Number    Contents 

0              Number of text columns 
2              Character height (in pixels) 

Overflow Register 
Vertical Total Register 
Preset Row Scan Register 
Cursor Start 
Start Vertical Blanking Register 
End Vertical Blanking Register 
Mode Control Register 
Line Compare Register 

Attribute Controller Register Values: 

Palette Register 0 
Palette Register 1 
Palette Register 2 
Palette Register 3 
Palette Register 4 

Graphics Controller register values: 

Miscellaneous Register 

Below are the actual mode parameter tables for all the supported VESA standard modes. 


tblExtModeParms label byte 


; 
; Mode 55h / VESA Mode 109h / Internal Mode 1Dh 
; 132x25 Color Text (8x16 font) - 40.0 MHz, 31.5 KHz, 70 Hz 
; 
db 084h, 018h, 010h 
dw 02000h 
db 001h, 003h, 000h, 002h 
db 06Fh 
db 09Ah, 083h, 084h, 09Dh, 085h, 013h 
db 0BFh, 01Fh, 000h, 04Fh, 00Dh, 00Eh 
db 000h, 000h, 000h, 000h, 09Ch, 08Eh 
db 08Fh, 042h, 01Fh, 096h, 0B9h, 0A3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 014h, 007h, 038h, 039h, 03Ah, 03Bh 
db 03Ch, 03Dh, 03Eh, 03Fh, 00Ch, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 010h 
db 00Eh, 000h, 0FFh 

; 
; Mode 54h / VESA Mode 10Ah / Internal Mode 1Eh 
; 132x43 Color Text (8x9 font) - 40.0 MHz, 31.5 KHz, 70 Hz 
; 
db 084h, 02Ah, 009h 
dw 04000h 

db O01lh, 003 
db QO6Fh 

db O9Ah, 083 
db OBFh, O1F 
db 000h, 000 
db O82h, 042 
db OFFh 

db 000h, 001 
db 014h, 007 
db 03Ch, 03D 
db OOFh, 000 
db 000h, 000 
db OOEh, 000 


Ne Ne Ne Ne 


db 084h, 
dw 04000h 
db 00O1h, 
db O06Fh 
db O9Eh, 
db OBFh, 
db 000h, 
db O8Fh, 
db OFFh 
db 000h, 
db 014h, 
db 03Ch, 
db OOFh, 
db 000h, 
db OOEh, 


Ne Ne Ne Ne 


db 084h, 
dw 04000h 
db OO1h, 
db OEFh 
db O9Eh, 
db OOBh, 
db 000h, 
db ODFh, 
db OFFh 
db 000h, 
db 014h, 
db 03Ch, 
db OOFh, 
db 000h, 
db OOEh, 


; 
IF BANSHEE _DOUBLESCAN 


Ne Ne Ne Ne Ne eS 


db 028 
dw OFFE 
db 001 
db O6F 
db 02D 
db OBF 
db 000 
db O8F 
db OFF 


Mode 78 / VESA Mode 
Mode 79 / VESA Mode 
Mode 7A / VESA Mode 
320x200 —- 256-color, 
25.175 MHz, 


001 
007 
03D 
000 
000 
000 


h, 000 


h, 084 
h, 000 
h, 000 
h, O1F 


h, 002 
h, 038 
h, O3E 
h 
h, 000 
h, OFF 


Mode 65h / VESA Mode 10Bh / 
132x50 Color Text (8x8 


h, 008 
h, 000 


h, 084 
h, 000 
h, 000 
h, O1F 


h, 002 
h, 038 
h, O3E 
h 
h, 000 
h, OFF 


Mode 64h / VESA Mode 10Ch / 
132x60 Color Text (8x8 


h, 008 
h, 000 


h, 084 
h, 000 
h, 000 
h, O1F 


h, 002 
h, 038 
h, O3E 


h, 000 
h, OFF 


h, O0O02h 


h, O9Dh, 
h, 048h, 
h, 0O0Oh, 
h, 089h, 


h, 003h, 
h, 039h, 
h, O3Fh, 


h, 000h, 


Internal 
font) - 


h, O0O02h 


h, O8ih, 
h, O47h, 
h, 0O0Oh, 
h, 096h, 


h, 003h, 
h, 039h, 
h, O3Fh, 


h, 000h, 


Internal 
font) - 


h, O0O02h 


h, O8lh, 
h, O47h, 
h, 0O0Oh, 
h, OE7h, 


h, 003h, 
h, 039h, 
h, O3Fh, 


h, 000h, 


085h, 
007h, 
092h, 
OB9h, 


004h, 
Q3Ah, 
00Ch, 


000h, 


PryYryNpy 


Mode 1Fh 


40.0 MHz, 


Q8Ah, 
006h, 
09Ch, 
OB9h, 


004h, 
Q3Ah, 
00Ch, 


000h, 


PryryDDyD 


Mode 20h 


40.0 MHz, 


Q8Ah, 
006h, 
QEAh, 
004h, 


004h, 
Q3Ah, 
00Ch, 


000h, 


PYyPpyNDy) 


180 / *Internal Mode 21h* 


10E / Internal Mode 25h 
F / Internal Mode 26h 
32K-color, 


10 


i earro: 


OOF 


027 
O1F 
000 
028 


KHz, 70 Hz 

h, 008h 

h, OO0Oh, OOEh 
h, 028h, 090h, 
h, OO0Oh, OCOh, 


h, OO0Oh, OO0O0h, 
h, O1Fh, 096h, 


16M-color 


029h, 
000h, 
09Ch, 
OB9h, 
08Fh 
000h 
08Eh 
0E3h 
31.5 KHz, 


31.5 KHz, 


(8x8 font, 


70 Hz 


60 Hz 


40x25 


"Text") 


db 000h, OO1h, 
db 006h, OO7h, 
db 00Ch, OODh, 
db OOFh, 0O00h 
db 000h, 000h, 
db 005h, OOFh, 


320x240 - 256-color, 


Ne Ne Ne Ne Ne Ne Ne 


db 028h, 0O1Dh, 
dw OFFFFh 

db OOlh, OOFh, 
db OEFh 

db Q02Dh, O027h, 
db QOBh, O3Eh, 
db 000h, 0O00h, 
db ODFh, 028h, 
db OFFh 

db 000h, OO1h, 
db 006h, OO7h, 
db 00Ch, OODh, 
db OOFh, 000h 
db 000h, 0O00h, 
db 005h, OOFh, 


400x300 - 256-color, 


See eee Ty 


002h, 
008h, 
00Eh, 


000h, 
OFFh 


003h, 
009h, 
0OFh, 


000h, 


32K-color, 


25,175 MHz, 31.5 KHz, 60 


008h 
000h, 


028h, 
000h, 
000h, 
O1Fh, 


002h, 
008h, 
QOEh, 


000h, 
OFFh 


Hz 


OOE 


090 
O0cO 
000 
OE7 


003 
009 
OOF 


000 


32K-color, 


40.000/2 MHz, 35.5 KHz, 60 Hz 
db 032h, 024h, 008h 
dw 0FFFFh 
db 001h, 00Fh, 000h, 00Eh 
db 02Fh 
db 03Dh, 031h, 032h, 080h, 
db 072h, 0F0h, 000h, 060h, 
db 000h, 000h, 000h, 000h, 
db 057h, 064h, 000h, 058h, 
db 0FFh 
db 000h, 001h, 002h, 003h, 
db 006h, 007h, 008h, 009h, 
db 00Ch, 00Dh, 00Eh, 00Fh, 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 
db 005h, 00Fh, 0FFh 


Mode 26 / VESA Mode 187 / 
Mode 27 / VESA Mode 188 / 
Mode 28 / VESA Mode 189 / 


See ere Ty 


004h 
OOAh 
OO01h 


000h 


, 


, 


, 


’ 


005h 
OOBh 
000h 


040h 


Mode 20 / VESA Mode 181 / *Internal Mode 22h* 
Mode 21 / VESA Mode 182 / Internal Mode 27h 
Mode 22 / VESA Mode 183 / Internal Mode 28h 


16M-color 
029h, O8Fh 
000h, OOO0h 
OEAh, O0OCh 
004h, OE3h 
004h, OO05h 
QOAh, OOBh 
OO0lh, 000h 
000h, 040h 


Mode 23 / VESA Mode 184 / *Internal Mode 23h* 
Mode 24 / VESA Mode 185 / Internal Mode 29h 
Mode 25 / VESA Mode 186 / Internal Mode 2Ah 


16M-color 


035 
000 
059 
073 


004 
OOA 
001 


000 


Ny 
Ny 
Ny 
Ny 


Ny 
Ny 
Ny 


Ny 


01D 
000 
00D 
OE3 


PyYypyNy/ 


005h 
OOBh 
000h 


040h 


*Internal Mode 24h* 


Internal Mode 2Bh 
Internal Mode 2Ch 


512x384 - 256-color, 32K-color, 
65.000/2 MHz, 48.0 KHz, 60 Hz 
db 040h, 01Ch, 00Eh 
dw 0FFFFh 
db 001h, 00Fh, 000h, 00Eh 
db 02Fh 
db 04Fh, 03Fh, 040h, 083h, 
db 024h, 0F5h, 000h, 060h, 
db 000h, 000h, 000h, 000h, 
db 0FFh, 080h, 000h, 0FFh, 
db 0FFh 
db 000h, 001h, 002h, 003h, 
db 006h, 007h, 008h, 009h, 
db 00Ch, 00Dh, 00Eh, 00Fh, 


16M-color 


042h 
000h 
003h 
025h 


004h 
OOAh 
041h 


00Ch 
000h 
009h 
OE3h 


005h 


OOBh 
000h 
(8x8 font, 


(8x8 font, 


(8x14 font, 


40x30 


50x37 


64x27 


"Text") 


"Text") 


"Text") 


db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 040h 
db 005h, 00Fh, 0FFh 

tblStdParameters label byte 

; 
; Mode 0 / Internal Mode 00h 
; 40x25 - Color Text (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
; 
db 028h, 018h, 008h 
dw 00800h 
db 009h, 003h, 000h, 002h 
db 063h 
db 02Dh, 027h, 028h, 090h, 02Bh, 0A0h 
db 0BFh, 01Fh, 000h, 0C7h, 006h, 007h 
db 000h, 000h, 000h, 000h, 09Ch, 08Eh 
db 08Fh, 014h, 01Fh, 096h, 0B9h, 0A3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 006h, 007h, 010h, 011h, 012h, 013h 
db 014h, 015h, 016h, 017h, 008h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 010h 
db 00Eh, 000h, 0FFh 

; 
; Mode 1 / Internal Mode 01h 
; 40x25 - Color Text (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
; 
db 028h, 018h, 008h 
dw 00800h 
db 009h, 003h, 000h, 002h 
db 063h 
db 02Dh, 027h, 028h, 090h, 02Bh, 0A0h 
db 0BFh, 01Fh, 000h, 0C7h, 006h, 007h 
db 000h, 000h, 000h, 000h, 09Ch, 08Eh 
db 08Fh, 014h, 01Fh, 096h, 0B9h, 0A3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 006h, 007h, 010h, 011h, 012h, 013h 
db 014h, 015h, 016h, 017h, 008h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 010h 
db 00Eh, 000h, 0FFh 

; 
; Mode 2 / Internal Mode 02h 
; 80x25 - Color Text (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
; 
db 050h, 018h, 008h 
dw 01000h 
db 001h, 003h, 000h, 002h 
db 063h 
db 05Fh, 04Fh, 050h, 082h, 055h, 081h 
db 0BFh, 01Fh, 000h, 0C7h, 006h, 007h 
db 000h, 000h, 000h, 000h, 09Ch, 08Eh 
db 08Fh, 028h, 01Fh, 096h, 0B9h, 0A3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 006h, 007h, 010h, 011h, 012h, 013h 
db 014h, 015h, 016h, 017h, 008h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 010h 
db 00Eh, 000h, 0FFh 

; 
; Mode 3 / Internal Mode 03h 
; 80x25 - Color Text (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
; 
db 050h, 018 
dw 01000h 
db O01lh, 003 
db 063h 
db O5Fh, O4F 
db OBFh, O1F 
db 000h, 000 
db O8Fh, 028 
db OFFh 
db 000h, O01 
db O006h, O07 
db 014h, 015 
db OOFh, 000 
db 000h, 000 
db OOEh, 000 


Ne Ne Ne Ne Ne 


; 320x200 - 4-color CGA (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
db 028h, 018h, 008h 
dw 04000h 
db 009h, 003h, 000h, 002h 
db 063h 
db 02Dh, 027h, 028h, 090h, 
db 0BFh, 01Fh, 000h, 0C1h, 
db 000h, 000h, 000h, 000h, 
db 08Fh, 014h, 000h, 096h, 
db 0FFh 
db 000h, 013h, 015h, 017h, 
db 006h, 007h, 010h, 011h, 
db 014h, 015h, 016h, 017h, 
db 003h, 000h 
db 000h, 000h, 000h, 000h, 
db 00Fh, 000h, 0FFh 

; 320x200 - 4-color CGA (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
db 028h, 018h, 008h 
dw 04000h 
db 009h, 003h, 000h, 002h 
db 063h 
db 02Dh, 027h, 028h, 090h, 
db 0BFh, 01Fh, 000h, 0C1h, 
db 000h, 000h, 000h, 000h, 
db 08Fh, 014h, 000h, 096h, 
db 0FFh 
db 000h, 013h, 015h, 017h, 
db 006h, 007h, 010h, 011h, 
db 014h, 015h, 016h, 017h, 
db 003h, 000h 
db 000h, 000h, 000h, 000h, 
db 00Fh, 000h, 0FFh 


Ne Ne Ne Ne Ne 




008h 
000h, 


050h, 
000h, 
000h, 
O1Fh, 


002h, 
010h, 
016h, 


000h, 
OFFA 


Mode 4 / Internal Mode 04h 


Mode 5 / Internal Mode 05h 


Mode 6 / Internal Mode 06h 
640x200 — 2-color CGA (8x8 Font) 


002h 


082h, 
OC7h, 
000h, 
096h, 


003h, 
Q1lh, 
017h, 


000h, 


25.175 MHz, 31.5 KHz, 70 Hz 

db 050h, 018h, 008h 

dw 04000h 

db OO1lh, O001h, 000h, 006h 
db 063h 

db O5Fh, O4Fh, 050h, 082h, 
db OBFh, O1Fh, 000h, OCl1h, 
db 000h, 000h, 000h, 000h, 
db O8Fh, 028h, 000h, 096h, 
db OFFh 


055h, 
006h, 
09Ch, 
OB9h, 


004h, 
012h, 
008h, 


000h, 


Q2Bh, 
000h, 
09Ch, 
OB9h, 


002h, 
012h, 
001h, 


000h, 


Q2Bh, 
000h, 
09Ch, 
OB9h, 


002h, 
012h, 
001h, 


000h, 


054h, 
000h, 
09Ch, 
OB9h, 


080 
000 
08E 
OA2 


PyYyPryNDy) 


004h 
013h 
000h 


030h 


080 
000 
08E 
OA2 


PyPpyNpy/ 


004h 
013h 
000h 


030h 


080h 
000h 
08Eh 
OC2h 


22 

db Q000h, O17h, 017h, 017h, 
db Q017h, O17h, O17h, 017h, 
db Q017h, O17h, O17h, 017h, 
db 001h, 000h 
db 000h, O00h, O00h, 0O00h, 
db OODh, OO00h, OFFh 

7 

; Mode 7 / Internal Mode 07h 

; 80x25 -— Mono Text (9x14 Font) 

; 28.321 MHz, 31.5 KHz, 70 Hz 

; 
db 050h, 018h, OOEh 
dw 01000h 
db 000h, 003h, O000h, 003h 
db OA6h 
db O5Fh, O4Fh, 050h, 082h, 
db OBFh, O1Fh, 000h, O04Dh, 
db 000h, O00h, O00h, 000h, 
db O5Dh, 028h, OODh, 063h, 
db OFFh 
db 000h, O008h, 008h, 008h, 
db 008h, O08h, 010h, 018h, 
db 018h, 018h, 018h, 018h, 
db OOFh, 008h 
db 000h, O000h, O000h, 0O00h, 
db OOAh, OO0Oh, OFFh 


Mode 5Bh / VBE Mode 100h 
Mode 29h / VBE Mode 18Ah 
Mode 2Ah / VBE Mode 18Bh 


Me Ne Ne Ne Ne Ne Ne 


db 050h, 018h, 
dw OFFFFh 

db OOlh, OOFh, 
db 063h 

db O5Fh, O4Fh, 
db OBFh, O1Fh, 
db 000h, 0OO0O0h, 
db O8Fh, 050h, 
db OFFh 

db 000h, OO1h, 
db 006h, OO07h, 
db 00Ch, OODh, 
db OOFh, 000h 
db 000h, 0O00h, 
db 005h, OOFh, 


Mode 5Fh / VBE Mode 101h 
Mode 6E / VESA Mode 111h 
Mode 69 / VESA Mode 112h 


See Ty 


db 050h, 01Dh, 
dw OFFFFhA 

db OO1lh, OOFh, 
db OE3h 

db O5Fh, 04Fh, 
db OOBh, 03Eh, 
db 000h, 000h, 
db ODFh, 050h, 
db OFFh 

db 000h, 00O1h, 
db 006h, 007h, 
db 00Ch, OODh, 
db OOFh, 000h 
db 000h, 000h, 


640X400 —- 256-color, 
25.175 MHz, 31.5 KHz, 


640X480 — 256-color, 
25-2175 \MAZy Bib KAZ 


017h, 
017h, 
001h, 


000h, 


055 
00B 
083 
OBA 


008 
018 
OOE 


000 


017h 
017h 
000h 


000h 


PyPryy 


/ *Internal Mode 08h* 
/ Internal Mode 2Dh 
/ Internal Mode 2Fh 
16M-color 


010 


050 
000 
000 
O1F 


002 
008 
OOE 


000 
OFF 


32K-color, 
70 Hz 


h, OQOEh 


h, 082h, 
h, 040h, 
h, 0O0Oh, 
h, 096h, 


h, 003h, 
h, 009h, 
h, OOFh, 


h, 000h, 


055 
000 
09C 
OB9 


004 
OOA 
041 


000 


081 
000 
OOE 
OE3 


PyryNpy) 


005h 
OOBh 
000h 


040h 


/ *Internal Mode 09h* 
/ Internal Mode 2Fh 
/ Internal Mode 30h 
16M-color 


000 


32K-color, 
70 Hz 


h, OQOEh 


h, 082h, 
h, 040h, 
h, 00OOh, 
h, OE7h, 


h, 003h, 
h, 009h, 
h, OOFh, 


h, 000h, 


052 
000 
OEA 
004 


004 
OOA 
041 


000 
09 
000 
00C 
OE3 


PyPryNpy) 


005h 
OOBh 
000h 


040h 


(8x16 Font) 


(8x16 Font) 
db 005h, 00Fh, 0FFh 

; 
; Mode 6Ah / VBE Mode 102h / Internal Mode 0Ah 
; 800X600 - 16-color (8x16 Font) 
; 40.000 MHz, 38.000 KHz, 60 Hz 
; 
db 064h, 024h, 010h 
dw 0FA00h 
db 001h, 00Fh, 000h, 006h 
db 02Fh 
db 07Fh, 063h, 064h, 082h, 06Bh, 01Bh 
db 072h, 0F0h, 000h, 060h, 000h, 000h 
db 000h, 000h, 000h, 000h, 059h, 00Dh 
db 057h, 032h, 000h, 057h, 073h, 0E3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 014h, 007h, 038h, 039h, 03Ah, 03Bh 
db 03Ch, 03Dh, 03Eh, 03Fh, 001h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 000h 
db 005h, 00Fh, 0FFh 

; 
; Mode 5Ch / VBE Mode 103h / *Internal Mode 0Bh* 
; Mode 70h / VBE Mode 114h / Internal Mode 31h 
; Mode 71h / VBE Mode 115h / Internal Mode 32h 
; 800X600 - 256-color, 32K-color, 16M-color (8x16 Font) 
; 40.000 MHz, 38.000 KHz, 60 Hz 
; 
db 064h, 024h, 010h 
dw 0FFFFh 
db 001h, 00Fh, 000h, 00Eh 
db 02Fh 
db 07Fh, 063h, 064h, 082h, 069h, 019h 
db 072h, 0F0h, 000h, 060h, 000h, 000h 
db 000h, 000h, 000h, 000h, 059h, 00Dh 
db 057h, 064h, 000h, 058h, 073h, 0E3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 006h, 007h, 008h, 009h, 00Ah, 00Bh 
db 00Ch, 00Dh, 00Eh, 00Fh, 001h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 040h 
db 005h, 00Fh, 0FFh 

; 
; Mode 6B / VESA Mode 107 / *Internal Mode 0Ch* 
; Mode 74 / VESA Mode 11A / Internal Mode 35h 
; Mode 75 / VESA Mode 11B / Internal Mode 36h 
; 1280x1024 - 256-color, 32K-color, 16M-color (8x16 Font) 
; 108.0 MHz, 64 KHz, 60 Hz 
; 
db 0A0h, 03Fh, 010h 
dw 0FFFFh 
db 001h, 00Fh, 000h, 00Eh 
db 02Fh 
db 0CEh, 09Fh, 0A0h, 091h, 0A6h, 014h 
db 028h, 052h, 000h, 040h, 000h, 000h 
db 000h, 000h, 000h, 000h, 001h, 004h 
db 0FFh, 0A0h, 000h, 001h, 028h, 0E3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 006h, 007h, 008h, 009h, 00Ah, 00Bh 
db 00Ch, 00Dh, 00Eh, 00Fh, 041h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 040h 
db 005h, 00Fh, 0FFh 

; 
; Mode D / Internal Mode 0Dh 
; 320x200 - 16-color planar (8x8 Font) 
; 25.175 MHz, 31.5 KHz, 70 Hz 
; 

db O28h, 018 
dw 02000h 
db 009h, OOF 
db 063h 
db O2Dh, 027 
db OBFh, O1F 
db 000h, 000 
db O8Fh, 014 
db OFFh 
db 000h, O01 
db O006h, O07 
db 014h, 015 
db OOFh, 000 
db 000h, 000 
db 005h, OOF 


Ne Ne Ne Ne Ne 


db 050h, 018 
dw 04000h 

db OO1lh, OOF 
db 063h 

db O5Fh, 04F 
db OBFh, O1F 
db 000h, 000 
db O8Fh, 028 
db OFFh 

db 000h, 001 
db O006h, 007 
db 014h, 015 
db OOFh, 000 
db 000h, 000 
db 005h, OOF 




008 


000 


028 
000 
000 
000 


002 
010 
016 


000 
OFF 


Mode E / Internal Mode OEh 
640x200 -— 16-color planar 
25:.-17-9: MHZ. 31.5 KAZ; 


008 


000 


050 
000 
000 
000 


002 
010 
016 


000 
OFF 


006h 


090h, 
0cOh, 
000h, 
096h, 


003h, 
O1lh, 
017h, 


000h, 


(8x8 


70 Hz 


006h 


082h, 
0cOh, 
000h, 
096h, 


003h, 
Q1lh, 
017h, 


000h, 


Q2Bh, 
000h, 
09Ch, 
OB9h, 


004h, 
012h, 
OO01h, 


000h, 


Font) 


054h, 
000h, 
09Ch, 
OB9h, 


004h, 
012h, 
001h, 


000h, 


080 
000 
O8E 
OE3 


PYyPrNADy) 


005h 
013h 
000h 


000h 


080 
000 
O8E 
OE3 


PyYryNDy) 


005h 
013h 
000h 


000h 


; 
; Mode 5Eh / VBE Mode 105h / *Internal Mode 0Fh* 
; Mode 72h / VBE Mode 117h / Internal Mode 33h 
; Mode 73h / VBE Mode 118h / Internal Mode 34h 
; 1024X768 - 256-color, 32K-color, 16M-color (8x16 Font) 
; 65.000 MHz, 48.500 KHz, 60 Hz 
; 
db 080h, 02Fh, 010h 
dw 0FFFFh 
db 001h, 00Fh, 000h, 00Eh 
db 02Fh 
db 0A3h, 07Fh, 080h, 087h, 083h, 094h 
db 024h, 0F5h, 000h, 060h, 000h, 000h 
db 000h, 000h, 000h, 000h, 003h, 009h 
db 0FFh, 080h, 000h, 0FFh, 025h, 0E3h 
db 0FFh 
db 000h, 001h, 002h, 003h, 004h, 005h 
db 006h, 007h, 008h, 009h, 00Ah, 00Bh 
db 00Ch, 00Dh, 00Eh, 00Fh, 041h, 000h 
db 00Fh, 000h 
db 000h, 000h, 000h, 000h, 000h, 040h 
db 005h, 00Fh, 0FFh 


Ne Ne Ne Ne 


80x60 Color Text (8x8 font) 

db 050h, O3Bh, 008h 

dw 2600h 

db OOlh, 003h, O000h, 002h 
db OE3h 

db O5Fh, O4Fh, 050h, 082h, 
db OOBh, O3Eh, 000h, 047h, 
db 000h, O000h, O000h, 000h, 


055h, 
006h, 
OEAh, 

db ODFh, 028h, 
db OFFh 

db 000h, OO1h, 
db 014h, 0O07h, 
db 03Ch, O3Dh, 
db OOFh, 000h 
db 000h, 0O00h, 
db OOEh, 000h, 


Neo Ne Ne Ne Ne 


db 050h, 
dw 08000h 
db 001h, 
db OA2h 
db O5Fh, 
db OBFh, 
db 000h, 
db 05Dh, 
db OFFA 
db 000h, 
db 000h, 
db 000h, 
db 005h, 
db 000h, 
db 005h, 


Ne Ne Ne Ne Ne 


db 050h, 
dw 08000h 
db OO1h, 
db OA3h 
db O5Fh, 
db OBFh, 
db 000h, 
db 05Dh, 
db OFFh 
db 000h, 
db 014h, 
db 03Ch, 
db OOFh, 
db 000h, 
db 005h, 


Mode 0* / Internal 


Ne Ne Ne ee 


db 028h, 
dw 00800h 
db 009h, 
db OA3h 
db 02Dh, 
db OBFh, 
db 000h, 
db 05Dh, 
db OFFh 
db 000h, 
db 014h, 
db 03Ch, 
db OOFh, 
db 000h, 
db OOEh, 


’ 


OOF 


O4F 
O1F 
000 
028 


008 
000 
018 
000 
000 
005 


001 
007 
03D 
000 
000 
OOF 


Ny 


Ny 


Ny 
Ny 
Ny 
Ny 


Ny 
Ny 
Ny 
n 
Ny 
Ny 


Mode 10h / Internal Mode 
640x350 - 16-bit planar 
25.175 MHz, 31.5 KHz, 70 


Ny 


Ny 


Ny 
Ny 
Ny 
Ny 


Ny 
Ny 
Ny 
n 
Ny 
Ny 


01Fh, 


002h, 
038h, 
03Eh, 


000h, 
OFFh 


Mode F / Internal Mode 11h 
640x350 - 2-bit mono pseudo-planar 
25.175 MHz, 31.5 KHz, 70 


OOEh 
000h, 


050h, 
000h, 
000h, 
OOFh, 


000h, 
000h, 
000h, 


000h, 
OFFh 


12h 


OOEh 
000h, 


050h, 
000h, 
000h, 
OOFh, 


002h, 
038h, 
Q3Eh, 


000h, 
OFFh 


Mode 13h 


027 
O1F 
000 
014 


001 
007 
03D 
000 
000 
000 


40x25 - Color Text 
25.175 MHZ; 325 KAZ, 70 


Ny 


Ny 


Ny 
Ny 
Ny 
Ny 


(8x14 


OOEh 
000h, 


028h, 
000h, 
000h, 
O1Fh, 


002h, 
038h, 
Q3Eh, 


000h, 
OFFh 


OE7h, 


003h, 
039h, 
03Fh, 


000h, 


Hz 


006 


082 
040 
000 
063 


000 
008 
000 


000 


(8x14 
Hz 


004h, 


004h, 
03Ah, 
00Ch, 


000h, 


054 
000 
083 
OBA 


018 
000 
00B 


000 


Font) 


054 
000 
083 
OBA 


004 
O3A 
001 


000 


02B 
00B 
083 
OBA 


004 
O3A 
008 


000 
0A3h 


005h 
03Bh 
000h 


010h 


080 
000 
085 
OE3 


PyYryNDy 


018h 
000h 
000h 


000h 


080 
000 
085 
OE3 


PyYPrNDD 


005h 
03Bh 
000h 


000h 


PryYyryNADy) 
Mode 1* / Internal Mode 14h 


Mode 2* / Internal 


Mode 


3* / Internal 


40x25 - Color Text 


(8x14 Font) 


25.175 MHz; 31.5. Kez; 70 Hz 


db 028h, 018h, 
dw 00800h 

db 009h, 003h, 
db OA3h 

db O2Dh, O027h, 
db OBFh, O1Fh, 
db 000h, 0O00h, 
db O5Dh, 014h, 
db OFFh 

db 000h, OO1h, 
db 014h, OO7h, 
db 03Ch, O3Dh, 
db OOFh, 000h 
db 000h, 0O00h, 
db OOEh, 000h, 


80x25 -— Color Text 


OOEh 
000h, 002h 


028h, 090h, 
Q000h, O4Dh, 
000h, OO0Oh, 
Q1Fh, 063h, 


002h, 003h, 
038h, 039h, 
03Eh, 03Fh, 


000h, 000h, 
OFFh 


Mode 15h 


(8x14 Font) 


25.175 MHz, 31.5 KHz, 70 Hz


db 050h, 018h,
dw 01000h

db 001h, 003h,
db 0A3h

db 05Fh, 04Fh,
db 0BFh, 01Fh,
db 000h, 000h,
db 05Dh, 028h,
db 0FFh

db 000h, 001h,
db 014h, 007h,
db 03Ch, 03Dh,
db 00Fh, 000h
db 000h, 000h,
db 00Eh, 000h,


80x25 - Color Text


00Eh
000h, 002h


050h, 082h,
000h, 04Dh,
000h, 000h,
01Fh, 063h,


002h, 003h,
038h, 039h,
03Eh, 03Fh,


000h, 000h,
0FFh


Mode 16h 


(8x14 Font) 


25.175 MHz, 31.5 KHz, 70 Hz


db 050h, 018h,
dw 01000h

db 001h, 003h,
db 0A3h

db 05Fh, 04Fh,
db 0BFh, 01Fh,
db 000h, 000h,
db 05Dh, 028h,
db 0FFh

db 000h, 001h,
db 014h, 007h,
db 03Ch, 03Dh,
db 00Fh, 000h
db 000h, 000h,
db 00Eh, 000h,


00Eh
000h, 002h


050h, 082h,
000h, 04Dh,
000h, 000h,
01Fh, 063h,


002h, 003h,
038h, 039h,
03Eh, 03Fh,


000h, 000h,
0FFh


Mode 0+/1+ / Internal Mode 17h 


40x25 - Color Text


(9x16 Font) 


28.321 MHz, 31.5 KHz, 70 Hz 


db 028h, 018h, 
dw 00800h 

db 008h, 003h, 
db 067h 

db 02Dh, 027h, 


010h 
000h, 002h 


028h, 090h, 


02Bh,
00Bh,
083h,
0BAh,


004h,
03Ah,
008h,


000h,


055h,
00Bh,
083h,
0BAh,


004h,
03Ah,
008h,


000h,


055h,
00Bh,
083h,
0BAh,


004h,
03Ah,
008h,


000h,


02Bh, 
db 0BFh, 01Fh, 000h, 04Fh, 00Dh, 00Eh
db 000h, 000h, 000h, 000h, 09Ch, 08Eh
db 08Fh, 014h, 01Fh, 096h, 0B9h, 0A3h
db 0FFh
db 000h, 001h, 002h, 003h, 004h, 005h
db 014h, 007h, 038h, 039h, 03Ah, 03Bh
db 03Ch, 03Dh, 03Eh, 03Fh, 00Ch, 000h
db 00Fh, 008h
db 000h, 000h, 000h, 000h, 000h, 010h
db 00Eh, 000h, 0FFh


Mode 2+/3+ / Internal Mode 18h 
80x25 - Color Text (9x16 Font)
28.321 MHz, 31.5 KHz, 70 Hz 


Ne Ne Ne Ne Ne 


db 050h, 018h, 010h
dw 01000h
db 000h, 003h, 000h, 002h
db 067h
db 05Fh, 04Fh, 050h, 082h, 055h, 081h
db 0BFh, 01Fh, 000h, 04Fh, 00Dh, 00Eh
db 000h, 000h, 000h, 000h, 09Ch, 08Eh
db 08Fh, 028h, 01Fh, 096h, 0B9h, 0A3h
db 0FFh
db 000h, 001h, 002h, 003h, 004h, 005h
db 014h, 007h, 038h, 039h, 03Ah, 03Bh
db 03Ch, 03Dh, 03Eh, 03Fh, 00Ch, 000h
db 00Fh, 008h
db 000h, 000h, 000h, 000h, 000h, 010h
db 00Eh, 000h, 0FFh

;
; Mode 7+ / Internal Mode 19h
; 80x25 - Mono Text (9x16 Font)
; 28.321 MHz, 31.5 KHz, 70 Hz
;
db 050h, 018h, 010h
dw 01000h
db 000h, 003h, 000h, 002h
db 066h
db 05Fh, 04Fh, 050h, 082h, 055h, 081h
db 0BFh, 01Fh, 000h, 04Fh, 00Dh, 00Eh
db 000h, 000h, 000h, 000h, 09Ch, 08Eh
db 08Fh, 028h, 00Fh, 096h, 0B9h, 0A3h
db 0FFh
db 000h, 008h, 008h, 008h, 008h, 008h
db 008h, 008h, 010h, 018h, 018h, 018h
db 018h, 018h, 018h, 018h, 00Eh, 000h
db 00Fh, 008h
db 000h, 000h, 000h, 000h, 000h, 010h
db 00Ah, 000h, 0FFh

;
; Mode 11h / Internal Mode 1Ah
; 640x480 - 2-color planar (8x16 Font)
; 25.175 MHz, 31.5 KHz, 60 Hz
;
db 050h, 01Dh, 010h
dw 0A000h
db 001h, 00Fh, 000h, 006h
db 0E3h
db 05Fh, 04Fh, 050h, 082h, 054h, 080h
db 00Bh, 03Eh, 000h, 040h, 000h, 000h
db 000h, 000h, 000h, 000h, 0EAh, 08Ch
db 0DFh, 028h, 000h, 0E7h, 004h, 0C3h
db 0FFh
db 000h, 03Fh, 03Fh, 03Fh, 03Fh, 03Fh
db 03Fh, 03Fh, 03Fh, 03Fh, 03Fh, 03Fh
db 03Fh, 03Fh, 03Fh, 03Fh, 001h, 000h
db 001h, 000h
db 000h, 000h, 000h, 000h, 000h, 000h
db 005h, 001h,


Nee Ne Ne Ne 


db 050h, 01D
dw 0A000h

db 001h, 00F
db 0E3h

db 05Fh, 04F
db 00Bh, 03E
db 000h, 000
db 0DFh, 028
db 0FFh

db 000h, 001
db 014h, 007
db 03Ch, 03D
db 00Fh, 000
db 000h, 000
db 005h, 00F


Ne Ne Ne Ne Ne 


db 028h, 018
dw 02000h

db 001h, 00F
db 063h

db 05Fh, 04F
db 0BFh, 01F
db 000h, 000
db 08Fh, 028
db 0FFh

db 000h, 001
db 006h, 007
db 00Ch, 00D
db 00Fh, 000
db 000h, 000
db 005h, 00F


Ne Ne Ne Ne 


Ny 


Ny 


Ny 
Ny 
Ny 
Ny 


Ny 
Ny 
Ny 
n 
Ny 
Ny 


25.175 MHz, 31.9 KHz, 


Ny 


Ny 


Ny 
Ny 
Ny 
Ny 


Ny 
Ny 
Ny 
n 
Ny 
Ny 


OFFh 


Mode 12h / Internal Mode 1Bh 
640x480 - 16-color planar
25.175 MHz, 31.5 KHz, 


010 


050 
000 
000 
000 


002 
038 
03E 


000 
0FF


Mode 13h / Internal Mode 1Ch 
320x200 - 256-color 


(8x8 


008 


000 


050 
000 
000 
040 


002 
008 
00E


000
0FF


70 


(8x16 Font) 
60 Hz 


054 
000 
0EA
004


004
03A
001 


000 


054 
000 
09C 
0B9


004
00A
041 


000 
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a: 2D 
2D Register Map 


Memory Base 0: Offset 0x0100000 


Register Name | Address                  | Reg          | Bits | R/W | Description
status        | 0x000(0)                 | 0x0          | 31:0 | R   | Banshee status register
srcXY         | 0x05c(92)                | 0x17         | 28:0 | R/W | Starting pixel of blt source data
              |                          |              |      |     | Starting position for lines
              |                          |              |      |     | Top-most point for a polygon fill
              | 0x060(96)                |              |      |     |
              | 0x064(100)               |              |      |     |
              |                          |              |      |     | rectangle fills
              |                          |              |      |     | End point for lines
RESERVED      | 0x074(116)               | 0x1D         | 31:0 |     | Do not write
RESERVED      | 0x078(120)               | 0x1E         | 31:0 |     | Do not write
RESERVED      | 0x07c(124)               | 0x1F         | 31:0 |     | Do not write
launchArea    | 0x080(128) to 0x0ff(255) | 0x20 to 0x3F | 31:0 |     | Initiates 2D commands
colorPattern  | 0x100(256) to 0x1fc(508) | 0x40 to 0x7F | 31:0 | R/W | Pattern Registers (64 entries)


Register Descriptions


The 2D register set is described in the sections below. 


All 2D registers can be read, and all registers except for the status register are fully write-able. Reading a 
2D register will always return the value that will be used if a new operation is begun without writing a 
new value to that register. This value will either be the last value written to the register, or, if an operation 
has been performed since the value was written, the value after all operations have completed. 


All registers for the 2D section are unsigned unless specified otherwise. 


status Register 
The status register provides a way for the CPU to interrogate the graphics processor about its current state
and FIFO availability. The status register is read only, but writing to status clears any Banshee generated
PCI interrupts.


Bit   | Description
6     | Vertical retrace (0=Vertical retrace active, 1=Vertical retrace inactive). Default is 1.
8     | TREX busy (0=engine idle, 1=engine busy). Default is 0.
10    | 2D busy (0=idle, 1=busy). Default is 0.
27:12 |
30:28 | Swap Buffers Pending. Default is 0x0.
31    | PCI Interrupt Generated. Default is 0x0. (not currently implemented).


Bits(5:0) show the number of entries available in the internal host FIFO. The internal host FIFO is 64 
entries deep. The FIFO is empty when bits(5:0)=0x3f. Bit(6) is the state of the monitor vertical retrace 
signal, and is used to determine when the monitor is being refreshed. Bit(7) of status is used to determine 
if the graphics engine of FBI is active. Note that bit(7) only determines if the graphics engine of FBI is 
busy -- it does not include information as to the status of the internal PCI FIFOs. Bit(8) of status is used 
to determine if TREX is busy. Note that bit(8) of status is set if any unit in TREX is not idle -- this 
includes the graphics engine and all internal TREX FIFOs. Bit(9) of status determines if all units in the 
Banshee system (including graphics engines, FIFOs, etc.) are idle. Bit(9) is set when any internal unit in 
Banshee is active (e.g. graphics is being rendered or any FIFO is not empty). When the Memory FIFO is 
enabled, bits(27:12) show the number of entries available in the Memory FIFO. Depending upon the 
amount of frame buffer memory available, a maximum of 65,536 entries may be stored in the Memory 
FIFO. The Memory FIFO is empty when bits(27:12)=0xffff. Bits (30:28) of status track the number of
outstanding SWAPBUFFER commands. When a SWAPBUFFER command is received from the host
CPU, bits (30:28) are incremented; when a SWAPBUFFER command completes, bits (30:28) are
decremented. Bit(31) of status is used to monitor the status of the PCI interrupt signal. If Banshee
generates a vertical retrace interrupt (as defined in pciInterrupt), bit(31) is set and the PCI interrupt
signal line is activated to generate a hardware interrupt. An interrupt is cleared by writing to status with
“don't-care” data. NOTE THAT BIT(31) IS CURRENTLY NOT IMPLEMENTED IN HARDWARE, AND WILL ALWAYS
RETURN 0x0.
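

As an illustration of the field layout described above, the following C sketch splits a raw status value into its documented fields. The type name, the field names and decode_status() are hypothetical and not part of any 3Dfx software; how the raw value is read from the device (register mapping, volatile access) is outside the scope of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the status register. All names here are illustrative. */
typedef struct {
    unsigned host_fifo_free;  /* bits 5:0, 0x3f when the host FIFO is empty */
    bool     in_vretrace;     /* bit 6 reads 0 while vertical retrace is active */
    bool     fbi_busy;        /* bit 7 */
    bool     trex_busy;       /* bit 8 */
    bool     banshee_busy;    /* bit 9 */
    bool     twod_busy;       /* bit 10 */
    unsigned mem_fifo_free;   /* bits 27:12, 0xffff when the Memory FIFO is empty */
    unsigned swaps_pending;   /* bits 30:28 */
} banshee_status;

/* Split a raw 32-bit status value into the fields described in the text. */
static banshee_status decode_status(uint32_t raw)
{
    banshee_status s;
    s.host_fifo_free = raw & 0x3fu;
    s.in_vretrace    = ((raw >> 6) & 1u) == 0;
    s.fbi_busy       = ((raw >> 7) & 1u) != 0;
    s.trex_busy      = ((raw >> 8) & 1u) != 0;
    s.banshee_busy   = ((raw >> 9) & 1u) != 0;
    s.twod_busy      = ((raw >> 10) & 1u) != 0;
    s.mem_fifo_free  = (raw >> 12) & 0xffffu;
    s.swaps_pending  = (raw >> 28) & 0x7u;
    return s;
}
```

A driver would typically poll such a decoded value until banshee_busy is false before touching frame-buffer memory directly.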


Copyright © 1996-1999 3Dfx Interactive, Inc. Revision 1.01 
32 Printed 03/01/99 


10 
11 Reserved 
31 




Voodoo Banshee Universal Access 2d Databook 


intrCtrl Register 


The intrCtrl register controls the interrupt capabilities of Banshee. Bits 1:0 enable video horizontal sync 
signal generation of interrupts. Generated horizontal sync interrupts are detected by the CPU by reading 
bits 7:6 of intrCtrl. Bits 3:2 enable video vertical sync signal generation of interrupts. Generated vertical 
sync interrupts are detected by the CPU by reading bits 9:8 of intrCtrl. Bit 4 of intrCtrl enables 
generation of interrupts when the frontend PCI FIFO is full. Generated PCI FIFO Full interrupts are 
detected by the CPU by reading bit 10 of intrCtrl. PCI FIFO full interrupts are generated when intrCtrl bit
4 is set and the number of free entries in the frontend PCI FIFO drops below the value specified in
fbiInit0 bits(10:6). Bit 5 of intrCtrl enables the user interrupt command USERINTERRUPT generation
of interrupts. Generated user interrupts are detected by the CPU by reading bit 11 of intrCtrl. The tag 
associated with a generated user interrupt is stored in bits 19:12 of intrCtrl. 


Generated interrupts are cleared by writing a 0 to the bit signaling that a particular interrupt was generated
and writing a 1 to intrCtrl bit(31). For example, a PCI FIFO full generated interrupt is cleared by
writing a 0 to bit 10 of intrCtrl, and a generated user interrupt is cleared by writing a 0 to bit 11 of
intrCtrl. For both cases, bit 31 of intrCtrl must be written with the value 1 to clear the external PCI
interrupt. Care must be taken when clearing interrupts not to accidentally overwrite the interrupt mask
bits (bits 5:0 of intrCtrl) which enable generation of particular interrupts.


Note that writes to the intrCtrl register are not pushed on the PCI frontend FIFO, so writes to intrCtrl 
are processed immediately. Since intrCtrl is not FIFO’ed, writes to intrCtrl may be processed out-of- 
order with respect to other queued writes in the PCI and memory-backed FIFOs. 
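

The clearing rule above can be sketched as a read-modify-write. The helper below is hypothetical (its name and signature are not from the databook); it takes the value last read from intrCtrl and returns the value to write back, leaving the enable bits (5:0) and all other status bits as they were read.

```c
#include <stdint.h>

/* Compute the value to write to intrCtrl in order to clear one
 * "interrupt generated" bit (for example bit 10 for PCI FIFO full,
 * or bit 11 for a user interrupt).
 *
 * current    - value just read from intrCtrl
 * status_bit - bit number of the interrupt to acknowledge
 *
 * The chosen status bit is written as 0, bit 31 is written as 1 to
 * release the external PCI interrupt, and every other bit, including
 * the enable mask in bits 5:0, is written back unchanged.
 */
static uint32_t intrctrl_clear_value(uint32_t current, unsigned status_bit)
{
    uint32_t v = current & ~(UINT32_C(1) << status_bit);
    return v | (UINT32_C(1) << 31);
}
```

Because intrCtrl writes bypass the PCI frontend FIFO, a caller should not assume that such a write is ordered with respect to earlier queued register writes.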


Bit   | Description
0     | Horizontal Sync (rising edge) interrupts enable (1=enable). Default is 0.
1     | Horizontal Sync (falling edge) interrupts enable (1=enable). Default is 0.
2     | Vertical Sync (rising edge) interrupts enable (1=enable). Default is 0.
11    | User Interrupt Command interrupt generated (1=interrupt generated).
19:12 | User Interrupt Command Tag. Read only.
      | Hole counting interrupts enable (1=enable). Default is 0.
      | External pin pci_inta value, active low (0=PCI interrupt is active, 1=PCI interrupt is
      | inactive)


command Register


The command register sets the command mode for the 2D engine, as well as selecting a number of 
options. 


Bits (3:0) set the command mode for the 2D drawing engine as shown in the table below. If bit(8) is set, 
the command will be initiated as soon as the command register is written. If bit(8) is cleared, drawing 
will be initiated by a write to the launch area. For descriptions and examples of each command, see the 
2D launch area section. 


Command[3:0] | Command
0            | Nop - wait for idle
1            | Screen to screen blt
2            | Screen to screen stretch blt
             | Line
             | Rectangle fill


Setting Bit(9) makes line drawing reversible. If this bit is set, drawing a line from point A to point B will 
result in the same pixels being drawn as drawing a line from point B to point A. 


Bits(11:10) control the value placed in dstXY after each blt or rectangle fill command is executed. If 
bit(10) is 0, dst_x is unchanged. If bit(10) is 1, dst_x gets dst_x + dst_width. If bit(11) is 0, dst_y is 
unchanged. If bit(11) is 1, dst_y gets dst_y + dst_height. 


Bit(12) controls whether lines are stippled or solid. If bit(12) is 0, lines will be a solid color. If bit(12) is
1, lines will be made up of either a two color pattern using colorFore and colorBack or a
transparent stipple using colorFore, as determined by the transparency bit, bit(16).


Bit(13) controls the format of the pattern data. If bit(13) is set to 0, the pattern must be stored in the 
destination format. If it is set to 1, the pattern will be stored as a monochrome bitmap; Pattern registers 0 
and 1 will be used as an 8x8x1bpp pattern, which will be expanded into the destination format using the 
colorBack and colorFore registers. Note that if Bit(13) is set, and Bit(16) is set to indicate that 
monochrome data is transparent, the pattern will be used to determine pixel transparency without regard 
to the contents of the ROP register. 


Bits(15:14) control the direction of blting during screen-to-screen copies. Note that the corner of the
source and destination rectangles passed in the srcXY and dstXY registers will change depending on the
blting direction. Bit(15) also controls the direction of blting for host-to-screen copies. This can be used to
flip a pixel map so that the top span in host memory is drawn as the bottom span on the screen. Note that
the direction bits only apply to “pure” screen to screen blits, but not to stretch blits. Also, destination and
source color keying, along with color conversions, cannot be used with right to left blits.


Bit(16) controls whether monochrome source bitmaps, and monochrome patterns will be transparent or 
opaque. When bit(16) is 0, source bitmaps are opaque; a 0 in the bitmap will result in colorBack being 
written to the destination. When bit(16) is 1, source bitmaps and monochrome patterns are transparent. 
In this case, a 0 in the bitmap will result in the corresponding destination pixel being left unchanged. 


The X and Y pattern offsets give the coordinates within the pattern of the pixel which corresponds to the 
destination pixel pointed to by the destination base address register. In other words, if a pattern fill is 
performed which covers the origin, pixel (0,0) in the destination pixel map will be written with the color 
in pattern pixel (x_pat_offset, y_pat_offset). 


Bit(23) controls whether the clip0 or clip1 registers will be used for clipping. When bit(23) is 0, clipping
values from clip0Min and clip0Max will be used; when bit(23) is 1, clipping values from clip1Min and
clip1Max will be used.


Bits(31:24) contain ROP0, the ternary ROP that is used when colorkeying is disabled. For more
information on ROPs, see the description of the rop register.


command
Bit | Description
8   | Initiate command (1=initiate command immediately, 0 = wait for launch write)
9   | Reversible lines (1=reversible, 0=non-reversible)
12  | Stipple line mode (1 = stippled lines, 0 = solid lines)
13  | Pattern Format (1 = monochrome, 0 = color)
15  | Y direction (0 = top to bottom, 1 = bottom to top)
16  | Transparent monochrome (1 = transparent, 0 = opaque)
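

To show how the fields described above combine into one register value, here is a minimal C sketch. The macro names and make_command() are hypothetical; only the bit positions stated in the text are used, and the X/Y pattern offset fields are left out because their bit positions are not given in this section.

```c
#include <stdint.h>

/* Command modes from bits 3:0 (only the values listed in the text). */
#define CMD_NOP              0u
#define CMD_SCREEN_BLT       1u
#define CMD_SCREEN_STRETCH   2u

/* Option bits, named here for illustration only. */
#define CMD_INITIATE         (1u << 8)   /* start as soon as command is written */
#define CMD_REVERSIBLE_LINE  (1u << 9)
#define CMD_DST_X_ADVANCE    (1u << 10)  /* dst_x += dst_width after blt/fill */
#define CMD_DST_Y_ADVANCE    (1u << 11)  /* dst_y += dst_height after blt/fill */
#define CMD_STIPPLE_LINE     (1u << 12)
#define CMD_MONO_PATTERN     (1u << 13)
#define CMD_Y_BOTTOM_TO_TOP  (1u << 15)
#define CMD_TRANSPARENT      (1u << 16)
#define CMD_CLIP_SELECT_1    (1u << 23)

/* Assemble a command register value from a mode, option flags and ROP0. */
static uint32_t make_command(unsigned mode, uint32_t flags, uint8_t rop0)
{
    return (mode & 0xFu)
         | (flags & 0x00FFFFF0u)          /* keep flags out of mode and ROP0 */
         | ((uint32_t)rop0 << 24);
}
```

For example, make_command(CMD_SCREEN_BLT, CMD_INITIATE, 0xCC) describes a screen-to-screen blt with ROP0 set to 0xCC that starts as soon as the register is written.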


commandExtra Register 


This register contains miscellaneous control bits in addition to those in the command register. 


Bits(1:0) enable colorkeying; if a bit is 0, the corresponding colorkeying is disabled. Enabling source colorkeying with
monochrome source, or in line, polyline, polygon, or rectangle modes has no effect. For further 
explanation of these bits, see the description of the colorkey registers. 


If bit(2) is set, the current command, and any following it will not proceed until the next vertical blanking 
period begins. Wait for Vsync should not be used when performing non-DMA host blts. 


If bit(3) is set, only row 0 of the pattern will be used, rather than the usual 8 pattern rows.


commandExtra
Bit | Description
0   | Enable source colorkey (1=source colorkeying enabled, 0=source colorkeying disabled)
1   | Enable destination colorkey (1=enable dst colorkeying, 0=disable dst colorkeying)
2   | Wait for Vsync (1=wait for vsync, 0=execute immediately)
3   | Force pattern row 0 (1 = use only row 0, 0 = use all 8 pattern rows)


colorBack and colorFore Registers 


The colorBack and colorFore registers specify the foreground and background colors used in solid-fill 
and monochrome bitmap operations, and operations using a monochrome pattern. The color registers 
must be stored in the destination color format. 


The following table shows the format of the color registers for each destination format.


P = palette index 

R =red color channel 

G = green color channel 
B = blue color channel 


Dst Format Bits stored 
8 bpp 0000_0000_0000_0000_0000_0000_PPPP_PPPP 
15 bpp 0000_0000_0000_0000_ARRR_RRGG_GGGB_BBBB 


16 bpp 0000_0000_0000_0000_RRRR_RGGG_GGGB_BBBB 
24 bpp 0000_0000_RRRR_RRRR_GGGG_GGGG_BBBB_BBBB 
32 bpp 
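

As a worked illustration of the table above, the following C sketch packs 8-bit red, green and blue values into the 15, 16 and 24 bpp register layouts. pack_color() is a hypothetical helper; the 8 bpp case is a palette index rather than RGB, and the 32 bpp layout is not listed in the table, so both are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack 8-bit R, G, B into the colorFore/colorBack layout for the given
 * destination depth. Returns false for depths this sketch does not cover.
 * In the 15 bpp case the A bit is left at 0.
 */
static bool pack_color(unsigned bpp, uint8_t r, uint8_t g, uint8_t b,
                       uint32_t *out)
{
    switch (bpp) {
    case 15:  /* ARRR_RRGG_GGGB_BBBB */
        *out = ((uint32_t)(r >> 3) << 10) | ((uint32_t)(g >> 3) << 5) | (b >> 3);
        return true;
    case 16:  /* RRRR_RGGG_GGGB_BBBB */
        *out = ((uint32_t)(r >> 3) << 11) | ((uint32_t)(g >> 2) << 5) | (b >> 3);
        return true;
    case 24:  /* RRRR_RRRR_GGGG_GGGG_BBBB_BBBB */
        *out = ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
        return true;
    default:
        return false;
    }
}
```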


colorFore 
Bit 
foreground color 


colorBack 
Bit 
background color 


Pattern Registers 


The pattern registers contain an 8 pixel by 8 pixel pattern. The pixels must be either in the color format
of the destination surface, or in 1 bpp (monochrome) format. The pixels are to be written to the pattern
registers in packed format. So, only registers 0 and 1 will be used for monochrome patterns, registers 0
through 15 will be used when the destination is 8 bpp, and registers 0 through 31 will be used when the
destination is 16 bpp.


Pixels should be written into the pattern registers starting with the upper left-hand corner of the pattern,
proceeding horizontally left to right, and then vertically top to bottom. The least-significant bits of
pattern[0] should always contain pixel(0,0) of a color pattern.


The table below gives the bit position of monochrome pixels within the pattern registers. The bits are
numbered such that bit(0) represents the lsb of a register, and bit(31) represents the msb.


Order of pixel storage in the pattern registers for a monochrome pattern 


pattern(0)
Row 0 | 7  | 6  | 5  | 4  | 3  | 2  | 1 | 0 |
Row 1 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 |
Row 2
Row 3

pattern(1)
Row 4 | 7  | 6  | 5  | 4  | 3  | 2  | 1 | 0 |
Row 5 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 |
Row 6
Row 7


pattern(0-64) 
Bit 


srcBaseAddr and dstBaseAddr Registers 


Bits(23:0) of these registers contain the addresses of the pixels at x=0, y=0 on the source and destination 
surfaces in frame-buffer memory. Bit(31) of each register is reserved and must be zero. 


The srcBaseAddr register is used only for screen-to-screen blts. For host-blts, the alignment of the initial 
pixel sent from the host is determined by the x entry in the srcXY register. 


For YUYV422 and UYVY422 surfaces, the base address must be dword aligned. Thus bits(1:0) of 
srcBaseAddr must be 0. 


srcBaseAddr


dstBaseAddr
Bit   | Description
23:0  | Destination base address
30:24 | RESERVED
31    | Reserved (must be 0)


srcSize and dstSize Registers 


These registers are used only for blts and rectangle fills. They contain the height and width in pixels of
the source and destination rectangles. The srcSize register will only be used in Stretch-blt modes. For
non-stretched blts, the blt source size will be the same as the blt destination size, determined by the
dstSize register.


srcSize
Bit   | Description
12:0  | Blt Source Width
15:13 | RESERVED
28:16 | Blt Source Height
31:29 | RESERVED


dstSize
Bit   | Description
12:0  | Blt Destination Width
15:13 | RESERVED
28:16 | Blt Destination Height
31:29 | RESERVED


srcXY and dstXY Registers 


During screen-to-screen blts, the srcXY register sets the position from which blt data will be read. Note
that the starting position for a blt depends on the direction of the blt as shown in the table below. For
lines and polylines, srcXY is the starting point of the first line segment. For polygons, srcXY should
be the topmost vertex of the polygon, that is, the vertex with the lowest y value. If there are multiple
vertices sharing the lowest y value, srcXY should be set to the leftmost vertex with that y value.
Reading the srcXY register while in polygon mode will always return the last polygon vertex that the host
sent for the left side of the polygon.


The values in srcXY are signed; however, for blts srcXY must contain only positive values.


During host-to-screen blts, only the x entry of the srcXY register is used. This entry determines the
alignment of the initial pixel in the blt within the first dword sent from the host. For monochrome
bitmaps, bits[4:0] are used to determine the bit position within the dword of the initial pixel. For color
bitmaps, bits[1:0] give the position within the dword of the first byte of pixel data. Host blts are
always performed left-to-right (the x-direction bit in the command register is ignored), so the offset given
will always be that of the leftmost pixel in the first span. The alignment of the initial pixel of all spans
after the first is determined by adding the source stride (from the srcFormat register) to the alignment of
the previous span.


For blts, dstXY should be the starting pixel of the destination rectangle as shown in the table below. For
line and polyline modes, dstXY will be the endpoint of the first line segment.


In polygon mode, the dstXY register is used to store the last vertex sent for the right side of the polygon.
If command[8] is set when the command register is written in polygon mode, the value from srcXY will
be copied to dstXY. If command[8] is cleared, dstXY can be written with the rightmost pixel in the top
span of the polygon.


Command[15:14] | Starting corner
00             | Upper Left-hand corner
01             | Upper Right-hand corner
10             | Lower Left-hand corner
11             | Lower Right-hand corner


dstXY 


srcXY 
Signed x position of the first source pixel 


15:14 RESERVED 
28:16 Signed y position of the first source pixel 
31:30 RESERVED 


srcFormat and dstFormat Registers 


These registers specify the format and strides of the source and destination surfaces.


For linear surfaces, the stride of a pixel map is the number of bytes between the starting addresses of 
adjacent scan lines. For these surfaces, the units of the stride is always bytes, regardless of the pixel 
format. 


The number of bits per pixel is determined as described by the tables below. The ’32 bpp’ format contains 
24 bits of RGB, along with a byte of unused data, the ’24 bpp’ is packed 24 bit color. 


Data coming through the host port can be byte swizzled to allow conversion between big and little endian
formats, as selected by Bit 19 and 20 of the srcFormat register. If both byte and word swizzling are enabled,
the byte swizzling occurs first, followed by word swizzling.


The source packing bits control how the stride of the source will be determined during blts. If both bits
are zero, the stride is set by the stride entry. Otherwise, the stride is based on the width of the blt being
performed, as shown in the table below. The stride will equal the number of bytes in a row of the
rectangle being blted plus as many bytes as are required to get the necessary alignment.


For YUYV422 and UYVY422 source formats, linear strides must always be a dword multiple. Thus,
bits(1:0) of the srcFormat register must be 0.


When necessary, the blt engine will convert source pixels to the destination format.


When source pixels in 15bpp or 16bpp format are converted to 24bpp or 32bpp, color conversion is
performed by replicating the msbs of each channel into the extra lsbs required. When pixels are
converted from 32bpp or 24bpp formats to 15bpp or 16bpp, the extra lsbs are removed from each
channel. When any non-32bpp format is converted to 32bpp, the 8 msbs of each pixel (i.e. the alpha
channel) are filled with zeros.
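

The two conversion rules above (msb replication when widening, lsb removal when narrowing) can be sketched in C for the 16bpp and 24bpp cases. The function names are hypothetical, and the code illustrates the arithmetic only; it makes no claim about how the hardware implements it.

```c
#include <stdint.h>

/* Widen a 5-bit channel to 8 bits by copying its msbs into the new lsbs. */
static uint8_t expand5(uint8_t v) { return (uint8_t)((v << 3) | (v >> 2)); }

/* Widen a 6-bit channel to 8 bits the same way. */
static uint8_t expand6(uint8_t v) { return (uint8_t)((v << 2) | (v >> 4)); }

/* 16bpp (RRRR_RGGG_GGGB_BBBB) to 24bpp (R8 G8 B8) by msb replication. */
static uint32_t rgb565_to_rgb888(uint16_t p)
{
    uint8_t r = expand5((uint8_t)((p >> 11) & 0x1F));
    uint8_t g = expand6((uint8_t)((p >> 5) & 0x3F));
    uint8_t b = expand5((uint8_t)(p & 0x1F));
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}

/* 24bpp to 16bpp by dropping the extra lsbs of each channel. */
static uint16_t rgb888_to_rgb565(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xFF, g = (p >> 8) & 0xFF, b = p & 0xFF;
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Replicating the msbs, rather than padding with zeros, maps full-scale 5-bit and 6-bit values to full-scale 8-bit values, so white stays white after widening.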


Destination pixel formats are stored as shown in the description of the colorFore and colorBack registers.
RGB source formats match these; the other source formats are shown in the table below. For monochrome
source, p0 represents the leftmost pixel on the screen and p31 represents the rightmost. For YUV formats,
ya represents the left pixel and yb represents the pixel to the right of ya, etc. Thus, ya7 is the msb of the y
channel for the left pixel and ya0 is the lsb of the y channel for that pixel. In the diagram, the dword with
the lower address (which will be quadword aligned) is shown first, followed by the dword with the higher
address.


Source formats 


Monochrome 

p24 p25 p26 p27 p28 p29 p30 p31 p16 p17 p18 p19 p20 p21 p22 p23 p8 p9 p10 p11 p12 p13 p14 p15 p0 p1 p2 p3 p4 p5 p6 p7
UYVY 4:2:2

yb7 yb6 yb5 yb4 yb3 yb2 yb1 yb0 v7 v6 v5 v4 v3 v2 v1 v0 ya7 ya6 ya5 ya4 ya3 ya2 ya1 ya0 u7 u6 u5 u4 u3 u2 u1 u0
YUYV 4:2:2

v7 v6 v5 v4 v3 v2 v1 v0 yb7 yb6 yb5 yb4 yb3 yb2 yb1 yb0 u7 u6 u5 u4 u3 u2 u1 u0 ya7 ya6 ya5 ya4 ya3 ya2 ya1 ya0
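

The packed 4:2:2 layouts above can be read back with simple shifts. The sketch below assumes that each row of the diagram lists bit 31 first and bit 0 last; that reading is an assumption, and the type and function names are hypothetical.

```c
#include <stdint.h>

/* One dword of packed 4:2:2 data: two luma samples sharing one u/v pair. */
typedef struct {
    uint8_t ya;  /* luma of the left pixel */
    uint8_t yb;  /* luma of the pixel to its right */
    uint8_t u;
    uint8_t v;
} yuv_pair;

/* UYVY: yb in bits 31:24, v in 23:16, ya in 15:8, u in 7:0 (assumed order). */
static yuv_pair unpack_uyvy(uint32_t d)
{
    yuv_pair p;
    p.u  = (uint8_t)(d & 0xFF);
    p.ya = (uint8_t)((d >> 8) & 0xFF);
    p.v  = (uint8_t)((d >> 16) & 0xFF);
    p.yb = (uint8_t)((d >> 24) & 0xFF);
    return p;
}

/* YUYV: v in bits 31:24, yb in 23:16, u in 15:8, ya in 7:0 (assumed order). */
static yuv_pair unpack_yuyv(uint32_t d)
{
    yuv_pair p;
    p.ya = (uint8_t)(d & 0xFF);
    p.u  = (uint8_t)((d >> 8) & 0xFF);
    p.yb = (uint8_t)((d >> 16) & 0xFF);
    p.v  = (uint8_t)((d >> 24) & 0xFF);
    return p;
}
```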


Methods of color translation used for Blts 


          | 1bpp src        | 8bpp src          | 15bpp src                   | 16bpp src                   | 24bpp src              | 32bpp src                  | YUV src
8bpp dst  | color registers | direct or palette | not supported               | not supported               | not supported          | not supported              | not supported
15bpp dst | color registers | not supported     | direct                      | lsb removal                 | lsb removal            | lsb removal, alpha dropped |
16bpp dst | color registers | not supported     | msb duplication             | direct                      | lsb removal            | lsb removal, alpha dropped |
24bpp dst | color registers | not supported     | msb duplication             | msb duplication             | direct                 | direct, alpha dropped      |
32bpp dst | color registers | not supported     | msb duplication, zero alpha | msb duplication, zero alpha | rgb direct, zero alpha | direct                     | YUV => RGB, zero alpha


srcFormat
Bit   | Description
13:0  | Source Stride in bytes
      | RESERVED
19:16 | Source color format: 1, 8, 16, 24, 32 bpp RGB, YUYV422, UYVY422
      | Host port byte swizzle (1=enable)
      | Host port word swizzle (1=enable)
23:22 | Source packing
      | RESERVED


dstFormat
Bit | Description


srcFormat[19:16] | Source Format
0                | 1 bpp mono
8                | packed 4:2:2 YUYV
9                | packed 4:2:2 UYVY


srcFormat[23:22] | Packing            | Stride calculation
0                | Use stride register | srcFormat[13:0]
                 | Byte packed         | ceil(src_width * src_bpp/8)
                 | Word packed         | ceil(src_width * src_bpp/16)*2
                 | Double-word packed  | ceil(src_width * src_bpp/32)*4
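

The stride formulas in the table can be written with integer arithmetic as follows. The enum and function names are hypothetical, and the enum values are ordinary C enumerators chosen for the sketch; they are not claimed to match the encoding of srcFormat[23:22].

```c
#include <stdint.h>

/* Packing modes named after the rows of the stride table. */
typedef enum {
    PACK_USE_STRIDE_REG,
    PACK_BYTE,
    PACK_WORD,
    PACK_DWORD
} src_packing;

/* Source stride in bytes for a blt src_width pixels wide at src_bpp bits
 * per pixel. stride_reg is the stride entry, srcFormat[13:0].
 */
static uint32_t source_stride(src_packing mode, uint32_t stride_reg,
                              uint32_t src_width, uint32_t src_bpp)
{
    uint32_t bits = src_width * src_bpp;

    switch (mode) {
    case PACK_BYTE:  return (bits + 7) / 8;          /* ceil(bits/8)    */
    case PACK_WORD:  return ((bits + 15) / 16) * 2;  /* ceil(bits/16)*2 */
    case PACK_DWORD: return ((bits + 31) / 32) * 4;  /* ceil(bits/32)*4 */
    default:         return stride_reg & 0x3FFFu;    /* stride entry    */
    }
}
```

For a 3 pixel wide 24bpp source (72 bits per row) this gives 9, 10 and 12 bytes for byte, word and double-word packing respectively.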


clip0Min, clip0Max, clip1Min, and clip1Max Registers

The clip registers define the maximum and minimum x and y values of pixels that can be written in the
destination pixel map. There are two sets of clip registers; however, only one set is used at a time, as
determined by the clip select bit in the command register.


Clipping is inclusive of the minimum values, and exclusive of the maximum values. Thus if the clip
select bit is zero, only pixels with x values in the range [clip0Min_x, clip0Max_x) and y values in the
range [clip0Min_y, clip0Max_y) will be written.
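

The half-open ranges described above translate directly into a comparison. The structure and function below are hypothetical names used only to illustrate the inclusive-minimum, exclusive-maximum rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* One set of clip values, as held in a clipMin/clipMax register pair. */
typedef struct {
    uint32_t min_x, min_y;  /* inclusive */
    uint32_t max_x, max_y;  /* exclusive */
} clip_rect;

/* True if the pixel at (x, y) lies inside the clip rectangle and
 * would therefore be written. */
static bool clip_allows(const clip_rect *c, uint32_t x, uint32_t y)
{
    return x >= c->min_x && x < c->max_x &&
           y >= c->min_y && y < c->max_y;
}
```

With min = (0, 0) and max = (640, 480), pixel (639, 479) is written while (640, 479) is not.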


clip0Min
Bit   | Description
11:0  | x minimum clip when clip select is 0
15:12 | RESERVED
27:16 | y minimum clip when clip select is 0
31:28 | RESERVED


clip0Max
Bit   | Description
11:0  | x maximum clip when clip select is 0
15:12 | RESERVED
27:16 | y maximum clip when clip select is 0
31:28 | RESERVED


clip1Max
Bit   | Description
11:0  | x maximum clip when clip select is 1
15:12 | RESERVED
27:16 | y maximum clip when clip select is 1
31:28 | RESERVED


colorkey Registers 


These registers define the range of colors that will be transparent when color keying is enabled. 


Different ROPs are selected for each pixel depending on the result of that pixel's colorkey test. A source pixel
passes the colorkey test if it is within the inclusive range defined by the srcColorkeyMin and
srcColorkeyMax registers. A destination pixel passes the colorkey test if it is within the inclusive range
defined by the dstColorkeyMin and dstColorkeyMax registers.


For pixels with 8bpp formats, the color indices are compared directly. For pixels with 16, 24, or 32bpp
formats, each color channel (R, G, and B) is compared separately, and each channel must pass for the
colorkey test to be passed. In the 32bpp format, the upper 8 bits are ignored during colorkey testing.
Source colorkeying cannot be enabled if the source format is 1 bpp.


If colorkeying is disabled for the source or destination surfaces, that colorkey test is failed. 


For further information on ROP selection by the colorkey test results, see the description of the ROP 
register. 


The colorkey test uses the following formula: 
pass = (((color>=colorkey_min) && (color<=colorkey_max)) && colorkey_enable) 
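

For the 24bpp and 32bpp formats, the formula is applied to each channel separately. The following C sketch shows one way to express that; the function names are hypothetical, and the colors are assumed to use the 24bpp layout given for the colorFore and colorBack registers (red in bits 23:16, green in 15:8, blue in 7:0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range test for a single 8-bit channel. */
static bool channel_in_range(uint8_t c, uint8_t lo, uint8_t hi)
{
    return c >= lo && c <= hi;
}

/* Colorkey test for a 24bpp or 32bpp pixel. Bits 31:24 of the pixel are
 * ignored, every channel must pass, and a disabled test always fails. */
static bool colorkey_pass(uint32_t color, uint32_t key_min, uint32_t key_max,
                          bool enable)
{
    if (!enable)
        return false;

    for (int shift = 16; shift >= 0; shift -= 8) {
        uint8_t c  = (uint8_t)(color   >> shift);
        uint8_t lo = (uint8_t)(key_min >> shift);
        uint8_t hi = (uint8_t)(key_max >> shift);
        if (!channel_in_range(c, lo, hi))
            return false;
    }
    return true;
}
```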


srcColorkeyMin
Bit   | Description
23:0  | minimum color key value for source pixels
31:24 | RESERVED


srcColorkeyMax
Bit   | Description
23:0  | maximum color key value for source pixels
31:24 | RESERVED


dstColorkeyMin


dstColorkeyMax

rop Register 

This is a set of ternary ROPs used to determine how the source, destination, and pattern pixels will be 
combined. The default ROP, ROP0, is stored in the command register. Which of the four ROPs will be
used is determined on a per-pixel basis, based on the results of the source and destination colorkey tests, 
as shown in the following table: 


Source Colorkey Test | Destination Colorkey Test | ROP
fail                 | fail                      | ROP 0
                     |                           | ROP 1
                     |                           | ROP 2
                     |                           | ROP 3


rop
Bit   | Description
23:16 | ROP 3


lineStyle register 


The lineStyle register specifies how lines will be drawn. 


The bit pattern used for line stippling can be set to repeat every 1-32 bits, as set by the bit-mask size part
of this register. The bit-mask size entry gives the number of bits *minus one* that will be used from the
lineStipple register. Thus, if you want to use 2 bits to represent a dashed line, you would set the bit-mask
size to 1.


Each bit from the lineStipple register will determine the color or transparency of from 1-256 pixels. The 
repeat count determines the number of pixels along the line that will be drawn (or skipped) for each bit in 
the line pattern register. The number of pixels associated with each bit of the line pattern *minus one* 
must be written to the repeat count entry. 


The start position gives the offset within the line pattern register for the first pixel drawn in a line. It
consists of an integer index of the current bit in the line pattern, and a fractional offset that will determine
the number of pixels that will be drawn using that bit of the pattern. The number of pixels drawn using
the initial bit in the line pattern will equal the repeat count (i.e. the repeat count entry+1) minus the
fractional part of the start position. The bit positions within the lineStipple register are numbered
starting with the lsb at 0, going up to the msb at 31.


It is illegal to set the integer part of the stipple position to be greater than the bit-mask size. It is illegal to 
set the fractional part to be greater than the repeat count. If either part of the stipple position is too large, 
the behavior of the line drawing engine is undefined. 


Writing the lineStyle register will cause the stipple position to be loaded from the register. If the
lineStyle register is not written to between the execution of two line commands, the stipple position at the
start of the new line will be whatever it was after the completion of the last line. If the lineStyle register
is read while the 2D engine is idle, the stipple position read will always be that which will be used in the
next line operation. Thus, if the lineStyle register has been written since the last stippled line was drawn,
the value written will be returned; otherwise the value that remained after the last stippled line will be
returned. Reading the lineStyle register while the 2D engine is not idle will return an indeterminate
value for the stipple position.


In the following examples, ‘x’ represents a pixel colored with colorFore, and ‘o’ represents a pixel colored
with colorBack or that is transparent. ‘_S_’ shows that the line engine is starting at bit 0 in the
lineStipple register. ‘_’ shows that the line engine is using a new bit from the lineStipple register.


Example 


Say the bit-mask size is set to 6 (thus, the entry in the register is 5) and the line pattern is: 
lineStipple <= 010111b 


The pixel pattern that will be repeated is: 


repeat_count    repeating pixel pattern 

1               x_x_x_o_x_o_S_x_x_x_o_x_o 
2               xx_xx_xx_oo_xx_oo_S_xx_xx_xx_oo_xx_oo 
3               xxx_xxx_xxx_ooo_xxx_ooo_S_xxx_xxx_xxx_ooo_xxx_ooo 


Example 

Say the repeat count is 5 (the register entry is 4), the integer part of the start position is 7, and the 
fractional part of the start position is 2. The color of the first 3 pixels drawn for the line will be 
determined by bit 7 in the line pattern register, the next 5 pixels will be determined by bit 8, and so on. 


lineStyle <= 07020904h 
lineStipple <= 1010110111b 


pixels generated, where x=colorFore and o=colorBack: 


xxx_ooooo_xxxxx_S_xxxxx_xxxxx_xxxxx_ooooo_xxxxx_xxxxx_ooooo_xxxxx_ooooo_xxxxx_S 


Pseudo code for line pixel generation 


Here is the pseudo-code for determining the color of pixels generated by the line engine: 


// <repeat_count> and <bit_mask_size> are the register entries (the actual value minus one) 
<bit_position> = <start_position_integer> 
<pixel_position> = <start_position_fraction> 

while (<need_another_pixel>) { 
    if ( <line_pattern> & (1 << <bit_position>) ) { 
        <new_pixel_color> = <colorFore> 
    } else { 
        if (<transparent>) { 
            <new_pixel_color> = <transparent> 
        } else { 
            <new_pixel_color> = <colorBack> 
        } 
    } 

    if ( <pixel_position> == <repeat_count> ) { 
        <pixel_position> = 0 
        if (<bit_position> == <bit_mask_size>) { 
            <bit_position> = 0 
        } else { 
            <bit_position> = <bit_position> + 1 
        } 
    } else { 
        <pixel_position> = <pixel_position> + 1 
    } 
} 
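

The pseudo-code can be tried out on the host with a small C model. This is only a sketch: the function name stipple_pixels is hypothetical, the field positions for the repeat count (bits 7:0) and the bit-mask size (bits 12:8) are inferred from the lineStyle values in the examples, and the legality check reflects one reading of the rule on stipple positions. Foreground pixels are reported as 'x' and all others as 'o'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the stipple walk; not a driver API.
 * Writes 'count' characters plus a terminating NUL into out[]. */
static void stipple_pixels(uint32_t line_style, uint32_t line_stipple,
                           char *out, size_t count)
{
    uint32_t repeat_entry = line_style & 0xFFu;         /* pixels per bit, minus one (assumed 7:0) */
    uint32_t mask_entry   = (line_style >> 8) & 0x1Fu;  /* bit-mask size, minus one (assumed 12:8) */
    uint32_t pixel_pos    = (line_style >> 16) & 0xFFu; /* start position, fractional part */
    uint32_t bit_pos      = (line_style >> 24) & 0x1Fu; /* start position, integer part */

    /* Assumed reading of the legality rule: both parts must fit their entries. */
    assert(bit_pos <= mask_entry && pixel_pos <= repeat_entry);

    for (size_t i = 0; i < count; i++) {
        out[i] = ((line_stipple >> bit_pos) & 1u) ? 'x' : 'o';
        if (pixel_pos == repeat_entry) {
            pixel_pos = 0;
            bit_pos = (bit_pos == mask_entry) ? 0 : bit_pos + 1;
        } else {
            pixel_pos++;
        }
    }
    out[count] = '\0';
}
```

Called with lineStyle 07020904h and lineStipple 1010110111b, the model starts with three foreground pixels followed by five background pixels, matching the second example above.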


lineStyle 
Bit      Description 
7:0      Repeat count 
12:8     Stipple size 
15:13    RESERVED 
23:16    Start position - fractional part 
28:24    Start position - integer part 
31:29    RESERVED 
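

A driver would typically build the lineStyle value from the four fields. The sketch below is one way to do that; line_style_pack is a hypothetical helper, and it assumes the repeat count sits in bits 7:0 and the stipple size in bits 12:8, as the example values 07020904h and 0x02010202 suggest. Both of those fields are stored minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: takes the real pixels-per-bit and pattern length
 * (not the minus-one register entries) and returns a lineStyle value. */
static uint32_t line_style_pack(unsigned pixels_per_bit, unsigned mask_bits,
                                unsigned start_int, unsigned start_frac)
{
    assert(pixels_per_bit >= 1 && pixels_per_bit <= 256);
    assert(mask_bits >= 1 && mask_bits <= 32);
    assert(start_int < mask_bits && start_frac < pixels_per_bit);

    return (uint32_t)(pixels_per_bit - 1)          /* repeat count entry */
         | ((uint32_t)(mask_bits - 1) << 8)        /* stipple size entry */
         | ((uint32_t)start_frac << 16)            /* start position, fraction */
         | ((uint32_t)start_int << 24);            /* start position, integer */
}
```

For instance, five pixels per bit, a 10-bit pattern, and a start position of 7 with fraction 2 would give 07020904h, the value used in the second stipple example.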


lineStipple Register 


The line bit-mask register contains a mask that determines how lines will be drawn. Bits that are ones 
will be drawn with the color in the colorFore register. Bits that are zeros will be filled with the color in 
the colorBack register, or will not be filled, depending on the ‘transparent’ bit in the command register. 
The pattern in the bit mask can be set to repeat every 1-32 bits, as set by the bit-mask size part of the line 
style register. If the bit-mask size is set to less than 31, some of the bits of the line bit-mask will not be 
used, starting with the most-significant bit. For example, if the bit-mask size is set to 7, bits 0-7 of the 
lineStipple register will contain the line bit-mask. 


lineStipple 


bresenhamError registers 


These registers allow the user to specify the initial Bresenham error terms used when performing line 
drawing, polygon drawing, and stretch blts. The Bresenham error terms are signed values. 


Bit 31 of each register determines whether or not the error term given in the lower bits will be used. If 
this bit is 0, the line and stretch blt engines will generate the initial error term automatically. If the bit is 
set to 1, the error term given in bits 16-0 will be used. If a Bresenham error register is used, the register 
should be written with bit[31] set to 0 after completion of the operation, so that subsequent operations will 
not be affected. 


bresError0 can be used to set the initial error value for lines, for the left edge of a polygon, and for blt 
stretching along the y-axis. 


bresError1 can be used to set the initial error value for the right edge of a polygon, and for blt stretching 
along the x-axis. 
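

As an illustration, the value written to bresError0 or bresError1 could be assembled as follows. This is a sketch only: bres_error_pack and BRES_ERROR_AUTO are hypothetical names, and the code assumes the signed term in bits 16-0 is stored in two's complement, which the text does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Value to write back after the operation so later operations
 * use the automatically generated error term (bit 31 = 0). */
#define BRES_ERROR_AUTO 0x00000000u

/* Hypothetical helper: enable bit 31 and place a signed 17-bit
 * error term (two's complement assumed) in bits 16-0. */
static uint32_t bres_error_pack(int32_t err)
{
    assert(err >= -65536 && err <= 65535);
    return 0x80000000u | ((uint32_t)err & 0x1FFFFu);
}
```

A driver using this would write bres_error_pack(term) before the line, polygon, or stretch blt, and BRES_ERROR_AUTO afterwards.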


bresError0 
Bit      Description 
16:0     Signed Bresenham error term for stretch blt y, lines, and left polygon edges 
30:17    RESERVED 
31       Use the error term given in bits 16-0 


bresError1 
Bit      Description 
16:0     Signed Bresenham error term for stretch blt x and right polygon edges 
30:17    RESERVED 
31       Use the error term given in bits 16-0 


Screen-to-screen Blt Mode 


Writing the launch area while in screen-to-screen blt mode results in a rectangle being copied from one 
area of display memory to another. The position of the source rectangle is given by the write to the launch 
area. The write to the launch area will be used to fill the srcXY register. 


screenBltLaunch 
Bit      Description 
12:0     X position of the source rectangle 
15:13    RESERVED 
28:16    Y position of the source rectangle 
31:29    RESERVED 
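

The XY launch words in this chapter share one layout, with X in the low half and Y in the high half. A minimal packing sketch follows; launch_xy_pack is a hypothetical name, the X field is taken to be bits 12:0 as in the rectFillLaunch table, and coordinates are treated as unsigned, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: X in bits 12:0, Y in bits 28:16,
 * reserved bits left at zero. Unsigned coordinates assumed. */
static uint32_t launch_xy_pack(unsigned x, unsigned y)
{
    assert(x <= 0x1FFFu && y <= 0x1FFFu);
    return ((uint32_t)y << 16) | (uint32_t)x;
}
```

With this layout the point (22,12) packs to 0x000C0016 and (4,1) packs to 0x00010004, the values that appear in the line and polygon examples.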


Screen-to-screen Stretch Blt Mode 


Writing the launch area while in screen-to-screen stretch blt mode results in pixels being copied from one 
rectangle in display memory to another of a different size. The write to the launch area will be used to fill 
the srcXY register. The x and y direction bits do not apply to stretch blts, i.e., only top-down, left-to-right 
stretch blts can be done. 


stretchBltLaunch 
Bit      Description 
12:0     X position of the source rectangle 
15:13    RESERVED 
28:16    Y position of the source rectangle 
31:29    RESERVED 


Host-to-screen Blt Mode 


In host-to-screen blt mode, writes to the launch area should contain packed pixels to be used as source 
data. When performing a host-to-screen blt, the blt engine does not generate source addresses. However, 
it is still necessary for the driver to specify the srcFormat, in order for the blt engine to determine how 
the source data is packed. The driver must also write the srcXY register in order to specify the first byte 
or bit to use from the first dword. In monochrome source mode, the 5 lsbs will specify the initial bit. In all 
other modes, the 2 lsbs of srcXY will specify the initial byte of the initial span. The alignment of the first 
pixel of each span after the first is determined by adding the source stride (from the srcFormat register) 
to the alignment of the previous span. 


If more data is written to the launch area than is required for the host blt specified, the extra data will be 
discarded, or may be used in the following host blt, if it is requested while the 2D engine is operating on 
the first hblt. If too little data is written to the launch area, the hblt will be aborted, and pixels on an 
incomplete span at the end of the host blt may or may not be drawn. 


Host Blt Example 1 


In this example, the driver is drawing text to a 1024x768x16bpp screen using monochrome bitmaps of 
various widths. The monochrome data is packed, with each row byte aligned. First, it sets up the 
necessary registers before giving the data specific to the first blt: 


colorBack <= the background color 
colorFore <= the foreground color 
dstXY <= the starting position of the first character 
dstBaseAddr <= base address of the primary surface 
clip0Min <= 0x00000000 
clip0Max <= 0xFFFFFFFF 
command <= SRC_COPY || HOST_BLT_MODE = 0xCC000003 
dstFormat <= 0x00030800 
srcFormat <= 0x00400000 


The command mode is set to host-to-screen blt, with all other features disabled. Since colorkeying is 
disabled, only ROP0 is needed. The format register sets the host format to unswizzled monochrome, 
using byte-packing. This means that the stride will not have to be set for each blt, but will be set to the 
number of bytes required to store the number of pixels in the source width (since this is not a stretch blt, 
the source width equals the destination width, as set later in the dstSize register). The clip registers are 
set such that the results will not be clipped. Although this is a host-to-screen blt, the srcXY register must 
be set in order to specify the initial alignment of the bitmask. For this example, the source data begins 
with the lsb of the first dword of host data, so the srcXY register is set to zero. 


Now, the driver is ready to start the first blt. It will blt an 11x7 pixel character. 
dstSize <= 0x0007000B 
srcXY <= 0x00000000 
launch <= 0xC0608020 
launch <= 0xC460C060 
launch <= 0x3B806EC0 
launch <= 0x00001100 
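

The number of launch writes follows from the packing. The sketch below computes it for byte-packed monochrome data; mono_host_blt_dwords is a hypothetical helper, and it assumes the data starts at the lsb of the first dword (srcXY of zero), as in this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dwords of host data for a byte-packed
 * monochrome blt whose data begins at bit 0 of the first dword. */
static size_t mono_host_blt_dwords(unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    size_t bytes_per_row = (width + 7u) / 8u;      /* each row is byte aligned */
    size_t total_bytes = bytes_per_row * height;
    return (total_bytes + 3u) / 4u;                /* round up to whole dwords */
}
```

For the 11x7 character, each row takes 2 bytes, so 14 bytes in total, which rounds up to the 4 launch writes shown.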


Host Blt Example 2 


In this example, the driver is drawing a pixel map: 

colorBack <= the background color 
colorFore <= the foreground color 
dstXY <= the starting position of the first character 
clip0Min <= 0x00000000 
clip0Max <= 0xFFFFFFFF 
command <= SRC_COPY || HOST_BLT_MODE = 0xCC000003 
srcFormat <= 0x00240000 


The command mode is set to host-to-screen blt, with all other features disabled. Since colorkeying is 
disabled, only ROP0 is needed. The format register sets the host format to unswizzled monochrome, 
using byte-packing. This means that the stride will not have to be set for each blt, but will be set to the 
number of bytes required to store the number of pixels in the source width (since this is not a stretch blt, 
the source width equals the destination width, as set later in the dstSize register). The clip registers are 
set such that the results will not be clipped. Although this is a host-to-screen blt, the srcXY register must 
be set in order to specify the initial alignment of the bitmask. For this example, the source data begins 
with the lsb of the first dword of host data, so the srcXY register is set to zero. 


Now, the driver is ready to start the first blt. It will blt an 11x7 pixel character. 
dstXY <= 0x0007000B 
srcXY <= 0x00000000 
launch <= 1st 2 rows 
launch <= 2nd 2 rows 
launch <= 3rd 2 rows 
launch <= last row 


hostBltLaunch 
Source pixel data 


Host-to-screen Stretch Blt Mode 


Writing the launch area while in host-to-screen stretch blt mode results in the pixels written to the launch 
area being stretched onto the destination rectangle. Pixel data for host-to-screen stretch blts is written just 
as for non-stretched host-to-screen blts, except when the destination height differs from the source height. 
In this case, the host must replicate or decimate the source spans to match the number of destination spans 
required. 


hostStretchLaunch 
Source pixel data 
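

How the host chooses which source span to send for each destination span is left to the driver. One simple possibility is a nearest-span mapping, sketched below. The name stretch_source_row and the rounding rule are assumptions for illustration, not something the hardware requires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the source span the host would send
 * for destination span dst_row. Rounds down, so spans are replicated
 * when stretching and skipped when shrinking. */
static unsigned stretch_source_row(unsigned dst_row, unsigned src_height,
                                   unsigned dst_height)
{
    assert(src_height > 0 && dst_height > 0 && dst_row < dst_height);
    return (unsigned)(((uint64_t)dst_row * src_height) / dst_height);
}
```

The driver would loop over the destination spans, look up the source span with this function, and write that span's packed pixels to the launch area.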


Rectangle Fill Mode 


Rectangle fill mode is similar to screen-to-screen blt mode, but in this mode, the colorFore register is 
used as source data rather than data from display memory. The size of the rectangle is determined by the 
dstSize register. The write to the launch area gives the position of the destination rectangle, which is 
used to fill the dstXY register. 


rectFillLaunch 
Bit      Description 
12:0     X position of the destination rectangle 
15:13    RESERVED 
28:16    Y position of the destination rectangle 
31:29    RESERVED 


Line Mode 


Writing the launch area while in line mode will write the launch data to the dstXY register and draw a 
line from srcXY to dstXY. After the line has been drawn, dstXY is copied to srcXY. In line mode, all 
pixels in the line will be drawn (as specified by the line style register), including both the start point and 
the endpoint. 


The ROP used for lines can use the pattern and the destination, but not source data. colorFore will be 
used in the ROP in place of source data. Source colorkeying must be turned off; destination colorkeying is 
allowed. 


Line drawing example 


srcXY <= 0x00020003 // line start-point = (3, 2) 
lineStipple <= 0x00000006 // bit mask is 110 binary 
lineStyle <= 0x02010202 // start position = 2 1/3, repeat count = 2, bit-mask size = 2 
colorBack <= BLACK 
colorFore <= GREY 
command <= LINE_MODE || OPAQUE 
launch <= 0x000c0016 // line end-point = (22,12) 


The line drawn will appear as shown below: 


Figure 1 


lineLaunch 
Bit      Description 
12:0     X position of the line endpoint 
15:13    RESERVED 
28:16    Y position of the line endpoint 
31:29    RESERVED 


Polyline Mode 


Writing the launch area while in polyline mode will write the launch data to the dstXY register and draw a 
line from srcXY to dstXY. After the line has been drawn, dstXY is copied to srcXY. In polyline mode, 
the endpoint of the line (the pixel at dstXY) will not be written. This ensures that each pixel in a 
non-overlapping polyline will be written only once. 


The ROP used for lines can use the pattern and the destination, but not source data. colorFore will be 
used in the ROP in place of source data. Source colorkeying must be turned off, destination colorkeying is 
allowed. 


polylineLaunch 
Bit      Description 
12:0     X position of the line endpoint 
15:13    RESERVED 
28:16    Y position of the line endpoint 
31:29    RESERVED 


Polygon Fill Mode 


The polygon fill mode can be used to draw simple polygons. A polygon may be drawn using the method 
described below if no horizontal span intersects more than two non-horizontal polygon edges. Polygons 
are drawn by first determining the top vertex, that is, the vertex with the lowest y coordinate. The 
coordinates of this vertex should be written to the srcXY register. If multiple vertices share the lowest y 
coordinate, any vertex with the lowest y coordinate may be used as the starting point. If command[8] is 
set when the command register is written and command[3:0] indicates polygon mode, the value in the 
srcXY register will be copied to the dstXY register. The value in the srcXY register determines the 
starting point for the left side of the polygon, while the value in the dstXY register determines the starting 
point for the right side of the polygon. If bit[8] of the command register is not set, the starting position of 
the right side of the polygon can be set by writing to the dstXY register. 


Once the starting vertex is set, as well as the desired colors, ROP, pattern, and options for the polygon fill, 
the polygon can be drawn by writing polygon vertices to the launch area. When multiple vertices share 
the lowest y coordinate, the starting vertex chosen will determine which of those vertices are on the ‘right’ 
edge of the polygon and which are on the ‘left’ edge. Pixels with the same y value as the starting point 
are on the left edge if they are to the left of the starting point. 


For optimum performance, software should determine the leftmost and rightmost of all vertices that share 
the lowest y coordinate. The coordinates of the leftmost vertex should be written to srcXY and the 
coordinates of the rightmost vertex should be written to dstXY. When the command register is written, 
command[8] (the ‘start command’ bit) should be low. 


In Polygon fill mode, polygon vertices should be written to the launch area in order of increasing y value. 
Whenever 2 vertices share the same y value, the leftmost vertex *must* be written first. The driver should 
keep track of the last y value sent for the left and right sides. If the y value for the last vertex sent for the 
left side is *less than or equal to* the last y value sent for the right side, the next vertex on the left side 
should be written to the launch area. Otherwise, the next vertex for the right side should be written to the 
launch area. 
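

The ordering rule can be expressed as a merge of the left and right vertex chains. The following C sketch is illustrative only: the Vertex type and polygon_launch_order are hypothetical names, both sides are assumed to start at the same vertex (as when command[8] copies srcXY to dstXY), and each chain is assumed to be already sorted by increasing y. When one chain runs out, the sketch simply drains the other, which is an added assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { unsigned x, y; } Vertex;   /* hypothetical vertex type */

/* Hypothetical helper: merges the left and right chains (start vertex
 * excluded) into launch words, Y in bits 28:16 and X in bits 12:0.
 * Returns the number of words written to out[]. */
static size_t polygon_launch_order(Vertex start,
                                   const Vertex *left, size_t n_left,
                                   const Vertex *right, size_t n_right,
                                   uint32_t *out)
{
    unsigned last_left_y = start.y, last_right_y = start.y;
    size_t li = 0, ri = 0, n = 0;

    while (li < n_left || ri < n_right) {
        int take_left;
        if (li == n_left)       take_left = 0;   /* left chain exhausted */
        else if (ri == n_right) take_left = 1;   /* right chain exhausted */
        else                    take_left = (last_left_y <= last_right_y);

        Vertex v = take_left ? left[li++] : right[ri++];
        if (take_left) {
            assert(v.y >= last_left_y);          /* chains must not go back up */
            last_left_y = v.y;
        } else {
            assert(v.y >= last_right_y);
            last_right_y = v.y;
        }
        out[n++] = ((uint32_t)v.y << 16) | (uint32_t)v.x;
    }
    return n;
}
```

Fed the two chains of the worked polygon example in this section, the sketch produces the same sequence of launch writes that the example walks through.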


The ROP used for filling polygons can use the pattern and the destination, but not source data. colorFore 
will be used in the ROP in place of source data. Source colorkeying must be turned off, destination 
colorkeying is allowed. 


Pixels that are on the line that forms the left edge of the polygon will be drawn. Pixels that fall on the line 
that forms the right edge of the polygon will not be drawn. For Horizontal edges, pixels on a horizontal 
polygon edge that is on the ‘top’ of the polygon (i.e. above the edge is outside the polygon and below the 
edge is inside the polygon) will be drawn, while pixels on a horizontal polygon edge that is on the bottom 
of the polygon will not be drawn. 


Polygon drawing example 


As an example of polygon drawing, say we are drawing the polygon shown in figure 2. Traversing the 
vertex list in counterclockwise order gives the following list of vertices: 


(4,1) (2,4) (3,6) (1,6) (2,8) (5,11) (8,8) (11,8) (11,6) (11,3) (10,1) 


Figures 2a through 2m show the steps in drawing the polygon. Filled circles are vertices of the left 
polygon edge. Open circles are vertices of the right polygon edge. Pixels that are drawn at the end of 
each step are shaded in the figures. 


The polygon engine keeps track of four vertices at a time: the top vertex of the current left polygon edge 
(L0), the bottom vertex of the current left polygon edge (L1), the top vertex of the current right polygon 
edge (R0), and the bottom vertex of the current right polygon edge (R1). The values of these variables at 
each step in drawing the polygon are shown in the figures. The arrows in the figures indicate when a 
variable changes between the start of the step and the end of pixel filling for that step. 


Figure 2 


First, all required registers must be written, including the dstFormat register to specify the drawing 
surface, color or pattern registers, and the command register. Write the coordinates of the starting vertex 
(4, 1) to the srcXY register: 


srcXY <= 0x00010004 
command <= POLYGON_MODE || INITIATE_COMMAND 


Figure 2a 


R1.y>=L1.y, so we have to write the next vertex for the left edge (2, 4): 
launch <= 0x00040002 


Figure 2b 


R1.y<L1.y, so we write the next vertex for the right edge (10, 1). The drawing engine now has edges for 
both the left and right sides, so it will draw all spans up to min(R1.y, L1.y). Because R1.y=R0.y, no 
pixels will be drawn, but R0 will be updated to vertex R1: 


launch <= 0x0001000a 


Figure 2c 


R1.y<L1.y, so we again write the next vertex on the right polygon edge (11, 3). Pixels on all spans from 
max(L0.y, R0.y) to min(L1.y, R1.y)-1 will be drawn, as shown below. Because R1.y<L1.y, R0 is updated 
to R1. 


launch <= 0x0003000b 


Figure 2d 


R1.y<L1.y, so we write the next vertex on the right edge (11, 6). Again, pixels on all spans from 
max(L0.y, R0.y) to min(L1.y, R1.y)-1 will be drawn. This time R1.y>L1.y, however, so L0 is updated to 
L1. 


launch <= 0x0006000b 


Figure 2e 


R1.y>=L1.y, so we write the next vertex on the left edge (3, 6). L1.y=R1.y, so R0 is updated to R1 and L0 
is updated to L1. 


launch <= 0x00060003 


Figure 2f 


R1.y>=L1.y, so we write the next vertex on the left edge (1, 6). L1.y=R1.y, so R0 is updated to R1 and L0 
is updated to L1. R1 did not change, so updating R0 to R1 has no effect. 


launch <= 0x00060001 


Figure 2g 


R1.y>=L1.y, so we again write the next vertex on the left edge (2, 8). L1.y>R1.y, so R0 is updated to R1, 
again with no effect. 


launch <= 0x00080002 


Figure 2h 


R1.y<L1.y, so we write the next vertex on the right edge (11, 8). L1.y=R1.y, so R0 is updated to R1, and 
L0 is updated to L1. 


launch <= 0x0008000b 


Figure 2i 


R1.y>=L1.y, so we write the next vertex on the left edge (5, 11). L1.y>R1.y, so R0 is updated to R1. 


launch <= 0x000b0005 


Copyright © 1996-1999 3Dfx Interactive, Inc. 
58 


Revision 1.01 
Printed 03/01/99 


Voodoo Banshee Universal Access 2d Databook 


Figure 2j 


R1.y<L1.y, so we write the next vertex on the right edge (8, 8). L1.y>R1.y, so R0 is updated to R1, but no 
pixels are drawn. 


launch <= 0x00080008 


Figure 2k 


R1.y<L1.y, so we write the next vertex on the right edge. This is the final vertex in the polygon, which 
doesn’t have a horizontal span at the bottom, so this vertex is the same as the last vertex for the left edge 
(5, 11). L1.y=R1.y, so R0 is updated to R1, and L0 is updated to L1. No pixels on the final span are 
drawn (this would be true even if L1.x did not equal R1.x). If the launch area is written again before any 
registers are written, the polygon engine will begin a new polygon starting at (5,11). 


launch <= 0x000b0005 


Figure 2m 


polygonLaunch 
Bit      Description 
12:0     X position of a polygon vertex 
15:13    RESERVED 
28:16    Y position of a polygon vertex 
31:29    RESERVED 


Miscellaneous 2D 


Write Sgram Mode Register 
Executing this command causes the value in srcBaseAddr[10:0] to be set as the sgram mode register via a 
special bus cycle in the memory controller. 


SGRAM mode register 
Bit      Description 
3        burst type (0=sequential, 1=interleave) 
         CAS latency 
         test mode 
         write burst length (0=burst, 1=single bit) 
         sgram-defined 


Write Sgram Color Register 


Executing this command causes the value in srcBaseAddr[31:0] to be set as the sgram color register via a 
special bus cycle in the memory controller. Since H3 has a 128-bit wide bus, the register is replicated 
across the four sets of sgram memories. 


Write Sgram Mask Register 


Executing this command causes the value in srcBaseAddr[31:0] to be set as the sgram mask register via a 
special bus cycle in the memory controller. Since H3 has a 128-bit wide bus, the register is replicated 
across the four sets of sgram memories. 


6. 3D Memory Mapped Register Set 
Memory Base 0: Offset 0x0200000 


Register Name   Address      Reg Num   Bits   Chip       R/W   Sync?/Fifo?   Description 
                0x000(0)     0x0       31:0   FBI        R     No / n/a      Banshee Status 
                0x004(4)                                                     Interrupt Status and Control 
nopCMD          0x120(288)                    FBI+TREX                       Execute NOP command 


nopCMD Register 


Writing any data to the nopCMD register executes the NOP command. Executing a NOP command 
flushes the graphics pipeline. The lsb of the data value written to nopCMD is used to optionally clear the 
fbiPixelsIn, fbiChromaFail, fbiZfuncFail, fbiAfuncFail, and fbiPixelsOut registers. Writing a ‘1’ to 
the lsb of nopCMD will clear the aforementioned registers. Writing a ‘0’ to the lsb of nopCMD will not 
modify the values of the aforementioned registers. 


Bit      Description 
0        Clear fbiPixelsIn, fbiChromaFail, fbiZfuncFail, fbiAfuncFail, and fbiPixelsOut 
         registers (1=clear registers) 


lfbMode Register 


The lfbMode register controls linear frame buffer accesses. 


Bit      Description 
3:0      Linear frame buffer write format (see table below) 
7:4      Reserved 
8        Enable Banshee pixel pipeline-processed linear frame buffer writes (1=enable) 
10:9     Linear frame buffer RGBA lanes (see tables below) 
11       16-bit word swap linear frame buffer writes (1=enable) 
12       Byte swizzle linear frame buffer writes (1=enable) 
13       LFB access Y origin (0=top of screen is origin, 1=bottom of screen is origin) 
14       Linear frame buffer write access W select (0=LFB selected, 1=zaColor[15:0]) 
15       Reserved 
16       Reserved 


The following table shows the supported Banshee linear frame buffer write formats: 


Value    Linear Frame Buffer Write Format 
         16-bit formats 
0        16-bit RGB (5-6-5) 
1        16-bit RGB (x-5-5-5) 
2        16-bit ARGB (1-5-5-5) 
3        Reserved 
         32-bit formats 
4        24-bit RGB (8-8-8) 
5        32-bit ARGB (8-8-8-8) 
7:6      Reserved 
12       16-bit depth, 16-bit RGB (5-6-5) 
13       16-bit depth, 16-bit RGB (x-5-5-5) 
14       16-bit depth, 16-bit ARGB (1-5-5-5) 
15       16-bit depth, 16-bit depth 


When accessing the linear frame buffer, the CPU accesses information from the starting linear frame buffer 
(LFB) address space (see section 4 on Banshee address space) plus an offset which determines the <x,y> 
coordinates being accessed. Bits(3:0) of lfbMode define the format of linear frame buffer writes. 


When writing to the linear frame buffer, lfbMode bit(8)=1 specifies that LFB pixels are processed by the 
normal Banshee pixel pipeline -- this implies each pixel written must have an associated depth and alpha 
value, and is also subject to the fog mode, alpha function, etc. If bit(8)=0, pixels written using LFB access 
bypass the normal Banshee pixel pipeline and the values written are unconditionally written into the 
color/depth buffers except for optional color dithering [depth function, alpha blending, alpha test, and 
color/depth write masks are all bypassed when bit(8)=0]. 
If bit(8)=0, then only the buffers that are specified in the particular LFB format are updated. Also note 
that if lfbMode bit(8)=0, the color and Z mask bits in fbzMode (bits 9 and 10) are ignored for LFB 
writes. For example, if LFB modes 0-2, or 4 are used and bit(8)=0, then only the color buffers are updated 
for LFB writes (the depth buffer is unaffected by all LFB writes for these modes, regardless of the status of 
the Z-mask bit, fbzMode bit 10). However, if LFB modes 12-14 are used and bit(8)=0, then both the color 
and depth buffers are updated with the LFB write data, irrespective of the color and Z mask bits in 
fbzMode. If LFB mode 15 is used and bit(8)=0, then only the depth buffer is updated for LFB writes (the 
color buffers are unaffected by all LFB writes in this mode, regardless of the status of the color mask bits 
in fbzMode). 




If lfbMode bit(8)=0 and an LFB write format is selected which contains an alpha component (formats 2, 5, 
and 14) and the alpha buffer is enabled, then the alpha component is written into the alpha buffer. 
Conversely, if the alpha buffer is not enabled, then the alpha component of LFB writes using formats 2, 5, 
and 14 when bit(8)=0 is ignored. Note that whenever LFB formats 2, 5, and 14 are used with bit(8)=0, 
blending and/or chroma-keying using the alpha component is not performed, since the pixel pipeline 
is bypassed when bit(8)=0. 


If lfbMode bit(8)=0 and LFB write format 14 is used, the component that is ignored is determined by 
whether the alpha buffer is enabled. If the alpha buffer is enabled and LFB write format 14 is used with 
bit(8)=0, then the depth component is ignored for all LFB writes. Conversely, if the alpha buffer is 
disabled and LFB write format 14 is used with bit(8)=0, then the alpha component is ignored for all LFB 
writes. 


If lfbMode bit(8)=1 and an LFB write access format does not include depth or alpha information (formats 
0-5), then the appropriate depth and/or alpha information for each pixel written is taken from the zaColor 
register. Note that if bit(8)=1, the LFB write pixels are processed by the normal Banshee pixel 
pipeline and thus are subject to the per-pixel operations including clipping, dithering, alpha-blending, 
alpha-testing, depth-testing, chroma-keying, fogging, and color/depth write masking. 


Bits(10:9) of lfbMode specify the RGB channel format (color lanes) for linear frame buffer writes. The 
table below shows the Banshee supported RGB lanes: 


Value    RGB Channel Format 
0        ARGB 
1        ABGR 
2        RGBA 
3        BGRA 


Bit(11) of lfbMode defines the format of two 16-bit data types passed with a single 32-bit write. For linear 
frame buffer formats 0-2, two 16-bit data transfers can be packed into one 32-bit write -- bit(11) defines 
which 16-bit shorts correspond to which pixels on screen. The table below shows the pixel packing for 
packed 32-bit linear frame buffer formats 0-2: 


lfbMode bit(11)    Screen Pixel Packing 
0                  Right Pixel(host data 31:16), Left Pixel(host data 15:0) 
1                  Left Pixel(host data 31:16), Right Pixel(host data 15:0) 


For linear frame buffer formats 12-14, bit(11) of lfbMode defines the bit locations of the two 16-bit data 
types passed. The table below shows the data packing for 32-bit linear frame buffer formats 12-14: 


lfbMode bit(11)    Screen Pixel Packing 
0                  Z value(host data 31:16), RGB value(host data 15:0) 
1                  RGB value(host data 31:16), Z value(host data 15:0) 


For linear frame buffer format 15, bit(11) of lfbMode defines the bit locations of the two 16-bit depth values 
passed. The table below shows the data packing for 32-bit linear frame buffer format 15: 


lfbMode bit(11)    Screen Pixel Packing 
0                  Z Right Pixel(host data 31:16), Z Left Pixel(host data 15:0) 
1                  Z Left Pixel(host data 31:16), Z Right Pixel(host data 15:0) 


Note that bit(11) of lfbMode is ignored for linear frame buffer writes using formats 4 or 5. 


Bit(12) of lfbMode is used to enable byte swizzling. When byte swizzling is enabled, the 4 bytes within a 
32-bit word are swizzled to correct for endian differences between Banshee and the host CPU. For little 
endian CPUs (e.g. Intel x86 processors) byte swizzling should not be enabled; however, big endian CPUs 
(e.g. PowerPC processors) should enable byte swizzling. For linear frame buffer writes, the bytes within a 
word are swizzled prior to being modified by the other control bits of lfbMode. When byte swizzling is 
enabled, bits(31:24) are swapped with bits(7:0), and bits(23:16) are swapped with bits(15:8). 


Very Important Note: The order of swapping and swizzling operations for LFB writes is as follows: byte 
swizzling is performed first on all incoming LFB data, as defined by lfbMode bit(12) and irrespective of 
the LFB data format. After byte swizzling, 16-bit word swapping is performed as defined by lfbMode 
bit(11). Note that 16-bit word swapping is never performed on LFB data when data formats 4 and 5 are 
used. Also note that 16-bit word swapping is performed on the LFB data that was previously optionally 
swizzled. Finally, after both swizzling and 16-bit word swapping are performed, the individual color 
channels are selected as defined in lfbMode bits(10:9). Note that the color channels are selected on the 
LFB data that was previously swizzled and/or swapped. 
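

The first two stages of this ordering can be modeled on the host as shown below. This is a sketch for reasoning about data layout, not a description of the hardware implementation; the function names are hypothetical, and the bit positions (format in bits 3:0, word swap in bit 11, byte swizzle in bit 12) are taken from the lfbMode description above. Color channel selection is left out.

```c
#include <stdint.h>

/* Swap bits 31:24 with 7:0 and bits 23:16 with 15:8. */
static uint32_t lfb_byte_swizzle(uint32_t w)
{
    return (w >> 24) | ((w >> 8) & 0x0000FF00u)
         | ((w << 8) & 0x00FF0000u) | (w << 24);
}

/* Exchange the two 16-bit halves of a 32-bit word. */
static uint32_t lfb_word_swap(uint32_t w)
{
    return (w >> 16) | (w << 16);
}

/* Hypothetical model: byte swizzle first, then the 16-bit word swap,
 * which is skipped for write formats 4 and 5. */
static uint32_t lfb_preprocess(uint32_t data, uint32_t lfb_mode)
{
    unsigned format = lfb_mode & 0xFu;

    if (lfb_mode & (1u << 12))
        data = lfb_byte_swizzle(data);
    if ((lfb_mode & (1u << 11)) && format != 4u && format != 5u)
        data = lfb_word_swap(data);
    return data;
}
```

With both bits set and format 0, the word 0x11223344 would first become 0x44332211 and then 0x22114433.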


Bit(13) of lfbMode is used to define the origin of the Y coordinate for all linear frame buffer writes when 
the pixel pipeline is bypassed (lfbMode bit(8)=0). Note that bit(13) of lfbMode does not affect rendering 
operations (FASTFILL and TRIANGLE commands) -- bit(17) of fbzMode defines the origin of the Y 
coordinate for rendering operations. Note also that if the pixel pipeline is enabled for linear frame buffer 
writes (lfbMode bit(8)=1), then fbzMode bit(17) is used to determine the location of the Y origin. When 
bit(13) is cleared, the Y origin (Y=0) for all linear frame buffer accesses is defined to be at the top of the 
screen. When bit(13) is set, the Y origin for all linear frame buffer accesses is defined to be at the bottom 
of the screen. 


Bit(14) of lfbMode is used to select the W component used for LFB writes processed through the pixel 
pipeline. If bit(14)=0, then the MSBs of the fractional component of the 48-bit W value passed to the 
pixel pipeline for LFB writes through the pixel pipeline are the 16-bit Z value associated with the LFB 
write. [Note that the 16-bit Z value associated with the LFB write is dependent on the LFB format, and is 
either passed down pixel-by-pixel from the CPU, or is set to the constant zaColor(15:0)]. If bit(14)=1, 
then the MSBs of the fractional component of the 48-bit W value passed to the pixel pipeline for LFB 
writes are zaColor(15:0). Regardless of the setting of bit(14), when LFB writes go through the pixel 
pipeline, all other bits except the 16 MSBs of the fractional component of the W value are set to 0x0. 
Note that bit(14) is ignored if LFB writes bypass the pixel pipeline. 


Linear Frame Buffer Writes 


Linear frame buffer writes -- format 0: 

When writing to the linear frame buffer with 16-bit format 0 (RGB 5-6-5), the RGB channel format 
specifies the RGB ordering within a 16-bit word. If the Banshee pixel pipeline is enabled for LFB 
accesses (lfbMode bit(8)=1), then alpha and depth information for LFB format 0 is taken from the 
zaColor register. The following table shows the color channels for 16-bit linear frame buffer access 
format 0: 


RGB Channel     16-bit Linear frame     RGB Channel 
Format Value    buffer access bits 

0               15:0                    Red (15:11), Green(10:5), Blue(4:0) 
1               15:0                    Blue (15:11), Green(10:5), Red(4:0) 
2               15:0                    Red (15:11), Green(10:5), Blue(4:0) 
3               15:0                    Blue (15:11), Green(10:5), Red(4:0) 
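
As an illustration of the format 0 channel layouts, the following C sketch packs 8-bit color components into a 16-bit 5-6-5 pixel for each RGB channel format value. The helper name lfb_pack_565 and the truncation of 8-bit components to 5 or 6 bits are assumptions made for this example; they are not part of the Banshee programming interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack 8-bit R, G, B into an LFB format 0 (5-6-5) pixel.
 * lanes is the RGB channel format value (0-3) from lfbMode.
 * Values 0 and 2 place red in bits 15:11; values 1 and 3 place blue there. */
uint16_t lfb_pack_565(unsigned lanes, uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t r5 = r >> 3, g6 = g >> 2, b5 = b >> 3;

    assert(lanes <= 3);
    if (lanes == 0 || lanes == 2)
        return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
    return (uint16_t)((b5 << 11) | (g6 << 5) | r5);
}
```

With channel format 0, pure red (0xFF, 0, 0) packs to 0xF800; with channel format 1 the same color packs to 0x001F.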


Linear frame buffer writes -- format 1: 

When writing to the linear frame buffer with 16-bit format 1 (RGB 5-5-5), the RGB channel format 
specifies the RGB ordering within a 16-bit word. If the Banshee pixel pipeline is enabled for LFB 
accesses (lfbMode bit(8)=1), then alpha and depth information for LFB format 1 is taken from the 
zaColor register. The following table shows the color channels for 16-bit linear frame buffer access 
format 1: 


RGB Channel     16-bit Linear frame     RGB Channel 
Format Value    buffer access bits 

1               15:0                    Ignored(15), Blue (14:10), Green(9:5), Red(4:0) 


Linear frame buffer writes -- format 2: 

When writing to the linear frame buffer with 16-bit format 2 (ARGB 1-5-5-5), the RGB channel format 
specifies the RGB ordering within a 16-bit word. If the Banshee pixel pipeline is enabled for LFB 
accesses (lfbMode bit(8)=1), then depth information for LFB format 2 is taken from the zaColor register. 
Note that the 1-bit alpha value passed when using LFB format 2 is bit-replicated to yield the 8-bit alpha 
used in the pixel pipeline. The following table shows the color channels for 16-bit linear frame buffer 
access format 2: 


RGB Channel     16-bit Linear frame     RGB Channel 
Format Value    buffer access bits 

0               15:0                    Alpha(15), Red (14:10), Green(9:5), Blue(4:0) 
1               15:0                    Alpha(15), Blue (14:10), Green(9:5), Red(4:0) 
2               15:0                    Red (15:11), Green(10:6), Blue(5:1), Alpha(0) 
3               15:0                    Blue (15:11), Green(10:6), Red(5:1), Alpha(0) 
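
The four 1-5-5-5 layouts and the alpha bit replication can be sketched in C as follows. This is a minimal illustration, assuming channel format values 0-3 correspond to the four layouts in order; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack a format 2 (1-5-5-5) pixel.
 * lanes 0: A(15) R(14:10) G(9:5) B(4:0)   lanes 1: A(15) B(14:10) G(9:5) R(4:0)
 * lanes 2: R(15:11) G(10:6) B(5:1) A(0)   lanes 3: B(15:11) G(10:6) R(5:1) A(0) */
uint16_t lfb_pack_1555(unsigned lanes, unsigned a1, uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t hi = (lanes & 1) ? (b >> 3) : (r >> 3);  /* most significant color field */
    uint16_t lo = (lanes & 1) ? (r >> 3) : (b >> 3);  /* least significant color field */
    uint16_t rgb = (uint16_t)((hi << 10) | ((g >> 3) << 5) | lo);

    assert(lanes <= 3 && a1 <= 1);
    if (lanes < 2)
        return (uint16_t)((a1 << 15) | rgb);   /* alpha in bit 15 */
    return (uint16_t)((rgb << 1) | a1);        /* alpha in bit 0 */
}

/* Bit-replicate the 1-bit alpha into the 8-bit alpha used by the pixel pipeline. */
uint8_t lfb_alpha1_to_alpha8(unsigned a1)
{
    return a1 ? 0xFF : 0x00;
}
```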


Linear frame buffer writes -- format 3: 
Linear frame buffer format 3 is an unsupported format. 


Linear frame buffer writes -- format 4: 

When writing to the linear frame buffer with 24-bit format 4 (RGB x-8-8-8), the RGB channel format 
specifies the RGB ordering within a 24-bit word. Note that the alpha/A channel is ignored for 24-bit 
access format 4. Also note that while only 24 bits of data are transferred for format 4, all data accesses must 
be 32-bit aligned -- packed 24-bit writes are not supported by Banshee. If the Banshee pixel pipeline is 
enabled for LFB accesses (lfbMode bit(8)=1), then alpha and depth information for LFB format 4 is taken 
from the zaColor register. The following table shows the color channels for 24-bit linear frame buffer 
access format 4: 


RGB Channel     24-bit Linear frame     RGB Channel 
Format Value    buffer access bits 
                (aligned to 32-bits) 

0               31:0                    Ignored(31:24), Red (23:16), Green(15:8), Blue(7:0) 


Linear frame buffer writes -- format 5: 

When writing to the linear frame buffer with 32-bit format 5 (ARGB 8-8-8-8), the RGB channel format 
specifies the ARGB ordering within a 32-bit word. If the Banshee pixel pipeline is enabled for LFB 
accesses (lfbMode bit(8)=1), then depth information for LFB format 5 is taken from the zaColor register. 
The following table shows the color channels for 32-bit linear frame buffer access format 5: 


RGB Channel     32-bit Linear frame     RGB Channel 
Format Value    buffer access bits 
                (aligned to 32-bits) 

0               31:0                    Alpha(31:24), Red (23:16), Green(15:8), Blue(7:0) 


Linear frame buffer writes -- formats 6-11: 
Linear frame buffer formats 6-11 are unsupported formats. 


Linear frame buffer writes -- format 12: 

When writing to the linear frame buffer with 32-bit format 12 (Depth 16, RGB 5-6-5), the RGB channel 
format specifies the RGB ordering within the 32-bit word. If the Banshee pixel pipeline is enabled for 
LFB accesses (lfbMode bit(8)=1), then alpha information for LFB format 12 is taken from the zaColor 
register. Note that the format of the depth value passed when using LFB format 12 must precisely match 
the format of the type of depth buffering being used (either 16-bit integer Z or 16-bit floating point 1/W). 
The following table shows the 16-bit color channels within the 32-bit linear frame buffer access format 
12: 


RGB Channel     16-bit Linear frame     RGB Channel 
Format Value    buffer access bits 

0               15:0                    Red (15:11), Green(10:5), Blue(4:0) 
1               15:0                    Blue (15:11), Green(10:5), Red(4:0) 
2               15:0                    Red (15:11), Green(10:5), Blue(4:0) 
3               15:0                    Blue (15:11), Green(10:5), Red(4:0) 


Linear frame buffer writes -- format 13: 

When writing to the linear frame buffer with 32-bit format 13 (Depth 16, RGB x-5-5-5), the RGB channel 
format specifies the RGB ordering within the 32-bit word. If the Banshee pixel pipeline is enabled for 
LFB accesses (lfbMode bit(8)=1), then alpha information for LFB format 13 is taken from the zaColor 
register. Note that the format of the depth value passed when using LFB format 13 must precisely match 
the format of the type of depth buffering being used (either 16-bit integer Z or 16-bit floating point 1/W). 
The following table shows the 16-bit color channels within the 32-bit linear frame buffer access format 
13: 


RGB Channel     16-bit Linear frame     RGB Channel 
Format Value    buffer access bits 

0               15:0                    Ignored(15), Red (14:10), Green(9:5), Blue(4:0) 
1               15:0                    Ignored(15), Blue (14:10), Green(9:5), Red(4:0) 
2               15:0                    Red (15:11), Green(10:6), Blue(5:1), Ignored(0) 
3               15:0                    Blue (15:11), Green(10:6), Red(5:1), Ignored(0) 


Linear frame buffer writes -- format 14: 

When writing to the linear frame buffer with 32-bit format 14 (Depth 16, ARGB 1-5-5-5), the RGB 
channel format specifies the RGB ordering within the 32-bit word. Note that the format of the depth 
value passed when using LFB format 14 must precisely match the format of the type of depth buffering 
being used (either 16-bit integer Z or 16-bit floating point 1/W). Also note that the 1-bit alpha value 
passed when using LFB format 14 is bit-replicated to yield the 8-bit alpha used in the pixel pipeline. The 
following table shows the 16-bit color channels within the 32-bit linear frame buffer access format 14: 


RGB Channel     16-bit Linear frame     RGB Channel 
Format Value    buffer access bits 

0               15:0                    Alpha(15), Red (14:10), Green(9:5), Blue(4:0) 
1               15:0                    Alpha(15), Blue (14:10), Green(9:5), Red(4:0) 
2               15:0                    Red (15:11), Green(10:6), Blue(5:1), Alpha(0) 
3               15:0                    Blue (15:11), Green(10:6), Red(5:1), Alpha(0) 


Linear frame buffer writes -- format 15: 

When writing to the linear frame buffer with 32-bit format 15 (Depth 16, Depth 16), the format of the 
depth values passed must precisely match the format of the type of depth buffering being used (either 16- 
bit integer Z or 16-bit floating point 1/W). If the Banshee pixel pipeline is enabled for LFB accesses 
(lfbMode bit(8)=1), then RGB color information is taken from the color1 register, and alpha information 
for LFB format 15 is taken from the zaColor register. 


userIntrCMD Register 
Writing to the userIntrCMD register executes the USERINTERRUPT command: 


Description 


Wait for USERINTERRUPT to be cleared before continuing (1=stall graphics engine 
until interrupt is cleared) 


Wait for interrupt generated by USERINTERRUPT (visible in intrCtrl bit(11)) to be 
cleared before continuing (1=stall graphics engine until interrupt is cleared) 


User interrupt Tag 


If the data written to userIntrCMD bit(0)=1, then a user interrupt is generated (intrCtrl bit(11) is set to 
1). If the data written to userIntrCMD bit(1)=1, then the graphics engine stalls and waits for the 
USERINTERRUPT interrupt to be cleared before continuing processing additional commands. If no 
USERINTERRUPT interrupt is set and the data written to userIntrCMD bit(1)=1, then the graphics 
engine will not stall and will continue to process additional commands. Software may also use 
combinations of userIntrCMD bits(1:0) to generate different functionality. 


The tag associated with a user interrupt is written to userIntrCMD bits(9:2). When a user interrupt is 
generated, the respective tag associated with the user interrupt is read from intrCtrl bits(19:12). 


If the USERINTERRUPT command does not stall the graphics engine (userIntrCMD(1)=0), then a 
potential race condition occurs between multiple USERINTERRUPT commands and software user 
interrupt processing. In particular, multiple USERINTERRUPT commands may be generated before 
software is able to process the first interrupt. Irrespective of how many user interrupts have been 
generated, the user interrupt tag field in intrCtrl (bits 19:12) always reflects the tag of the last 
USERINTERRUPT command processed. As a result of this behavior, early tags from multiple 
USERINTERRUPT commands may be lost. To avoid this behavior, software may force a single 
USERINTERRUPT command to be executed at a time by writing userIntrCMD(1:0)=0x3, which causes the 
graphics engine to stall until the USERINTERRUPT interrupt is cleared. 


Note that bit 5 of intrCtrl must be set to 1 for user interrupts to be generated -- writes to userIntrCMD 
when intrCtrl(5)=0 do not generate interrupts or cause the processing of commands to wait on clearing of 
the USERINTERRUPT command (regardless of the data written to userIntrCMD), and are thus in effect 
“dropped.” 


Command Descriptions 


NOP Command 


The NOP command is used to flush the graphics pipeline. When a NOP command is executed, all 
pending commands and writes to the texture and frame buffers are flushed and completed, and the 
graphics engine returns to its IDLE state. While this command is used primarily for debugging and 
verification purposes, it is also used to clear the 3D status registers (fbiTriangles, fbiPixelsIn, 
fbiPixelsOut, fbiChromaFail, fbiZfuncFail, and fbiAfuncFail). Setting nopCMD bit(0)=1 clears the 
3D status registers and flushes the graphics pipeline, while setting nopCMD bit(0)=0 has no effect on the 
3D status registers but flushes the graphics pipeline. See the description of the nopCMD register in 
section 5 for more information. 


USERINTERRUPT Command 


The USERINTERRUPT command allows for software-generated interrupts. A USERINTERRUPT 
command is generated by writing to the userIntrCMD register. userIntrCMD bit(0) controls whether a 
write to userIntrCMD generates a USERINTERRUPT. Setting userIntrCMD bit(0)=1 generates a 
USERINTERRUPT. userIntrCMD bit(1) determines whether the graphics engine stalls on software 
clearing of the user interrupt. By setting userIntrCMD bit(1)=1, the graphics engine stalls until the 
USERINTERRUPT is cleared. Alternatively, setting userIntrCMD bit(1)=0 does not stall the graphics 
engine upon execution of the USERINTERRUPT command, and additional graphics commands are 
processed without waiting for clearing of the user interrupt. An identification, or Tag, is also associated 
with an individual USERINTERRUPT command, and is specified by writing an 8-bit value to 
userIntrCMD bits(9:2). 


User interrupts must be enabled by setting intrCtrl bit(5)=1 before writes to userIntrCMD are allowed. 
Writes to userIntrCMD when intrCtrl bit(5)=0 are “dropped” and do not affect functionality. 
A user interrupt is detected by reading intrCtrl bit(11), and is cleared by setting intrCtrl bit(11)=0. The 
tag of a generated user interrupt is read from intrCtrl bits(19:12). See the description of the intrCtrl 
and userIntrCMD registers in section 5 for more information. 
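
The bit fields involved in the USERINTERRUPT command can be summarized with a short C sketch. It only builds and decodes register values using the bit positions given in the text (userIntrCMD bit(0), bit(1), bits(9:2); intrCtrl bit(5), bit(11), bits(19:12)); the macro and function names are hypothetical, and the actual register access mechanism is left out.

```c
#include <assert.h>
#include <stdint.h>

#define USERINTR_GENERATE   (1u << 0)   /* userIntrCMD bit(0): generate the interrupt */
#define USERINTR_STALL      (1u << 1)   /* userIntrCMD bit(1): stall until cleared */
#define INTRCTRL_USER_EN    (1u << 5)   /* intrCtrl bit(5): user interrupts enabled */
#define INTRCTRL_USER_PEND  (1u << 11)  /* intrCtrl bit(11): user interrupt generated */

/* Build the value to write to userIntrCMD; the tag occupies bits(9:2). */
uint32_t user_intr_cmd(int stall, uint8_t tag)
{
    uint32_t v = USERINTR_GENERATE | ((uint32_t)tag << 2);

    assert(stall == 0 || stall == 1);
    if (stall)
        v |= USERINTR_STALL;
    return v;
}

/* Extract the tag of the last USERINTERRUPT from an intrCtrl value, bits(19:12). */
uint8_t user_intr_tag(uint32_t intr_ctrl)
{
    return (uint8_t)((intr_ctrl >> 12) & 0xFF);
}

/* Return the intrCtrl value with the user interrupt cleared (bit(11)=0). */
uint32_t user_intr_clear(uint32_t intr_ctrl)
{
    return intr_ctrl & ~INTRCTRL_USER_PEND;
}
```

Passing stall=1 yields userIntrCMD(1:0)=0x3, the combination the text recommends when tags must not be lost.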


Linear Frame Buffer Access 


The Banshee linear frame buffer base address is located at an 8 Mbyte offset from the memBaseAddr PCI 
configuration register and occupies 4 Mbytes of Banshee address space. Regardless of actual frame buffer 
resolution, all linear frame buffer accesses assume a 2048-pixel logical scan line width. The number of 
bytes per scan line depends on the linear frame buffer access format selected in the lfbMode 
register. Note for all accesses to the linear frame buffer, the status of bit(16) of fbzMode is used to 
determine the Y origin of data accesses. When bit(16)=0, offset 0x0 into the linear frame buffer address 
space is assumed to point to the upper-left corner of the screen. When bit(16)=1, offset 0x0 into the linear 
frame buffer address space is assumed to point to the bottom-left corner of the screen. Regardless of the 
status of fbzMode bit(16), linear frame buffer addresses increment as accesses are performed going from 
left-to-right across the screen. Also note that clipping is not automatically performed on linear frame 
buffer writes if scissor clipping is not explicitly enabled (fbzMode bit(0)=1). Linear frame buffer writes 
to areas outside of the monitor resolution when clipping is disabled result in undefined behavior. 


Linear frame buffer Writes 


The following table shows the supported linear frame buffer write formats as specified in bits(3:0) of 
IfbMode: 


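Value   Linear Frame Buffer Access Format 

        16-bit formats 
0       16-bit RGB (5-6-5) 
1       16-bit RGB (x-5-5-5) 
2       16-bit ARGB (1-5-5-5) 
3       Unsupported 
        32-bit formats 
4       24-bit RGB (8-8-8) 
5       32-bit ARGB (8-8-8-8) 
6-11    Unsupported 
12      16-bit depth, 16-bit RGB (5-6-5) 
13      16-bit depth, 16-bit RGB (x-5-5-5) 
14      16-bit depth, 16-bit ARGB (1-5-5-5) 
15      16-bit depth, 16-bit depth 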


When writing to the linear frame buffer with a 16-bit access format (formats 0-3 and format 15 in 
lfbMode), each pixel written is 16-bits, so there are 2048 bytes per logical scan line. Remember when 
utilizing 16-bit access formats, two 16-bit values can be packed in a single 32-bit linear frame buffer write 
-- the location of each 16-bit component in screen space is defined by bit(11) of lfbMode. When using 
16-bit linear frame buffer write formats 0-3, the depth component associated with each pixel is taken 
from the zaColor register. When using 16-bit format 2, the alpha component associated with each pixel 
is taken from the 16-bit data transferred, but when using 16-bit formats 0-1 the alpha component 
associated with each pixel is taken from the zaColor register. The format of the individual color channels 
within a 16-bit pixel is defined by the RGB channel format field in lfbMode bits(12:9). See the lfbMode 
description in section 5 for a detailed description of the RGB channel format field. 


When writing to the linear frame buffer with 32-bit access formats 4 or 5, each pixel is 32-bits, so there 
are 4096 bytes per logical scan line. Note that when utilizing 32-bit access formats, only a single pixel 
may be written per 32-bit linear frame buffer write. Also note that linear frame buffer writes using format 
4 (24-bit RGB (8-8-8)), while 24-bit pixels, must be aligned to a 32-bit (doubleword) boundary -- packed 
24-bit linear frame buffer writes are not supported by Banshee. When using 32-bit linear frame buffer 
write formats 4-5, the depth component associated with each pixel is taken from the zaColor register. 
When using format 4, the alpha component associated with each pixel is taken from the zaColor register, 
but when using format 5 the alpha component associated with each pixel is taken from the 32-bit data 
transferred. The format of the individual color channels within a 24/32-bit pixel is defined by the RGB 
channel format field in lfbMode bits(12:9). 


When writing to the linear frame buffer with 32-bit access formats 12-14, each pixel is 32-bits, so there 
are 4096 bytes per logical scan line. Note that when utilizing 32-bit access formats, only a single pixel 
may be written per 32-bit linear frame buffer write. If depth or alpha information is not transferred with 
the pixel, then the depth/alpha information is taken from the zaColor register. The format of the 
individual color channels within a 24/32-bit pixel is defined by the RGB channel format field in lfbMode 
bits(12:9). The location of each 16-bit component of formats 12-15 in screen space is defined by bit(11) of 
lfbMode. See the lfbMode description in section 5 for more information about linear frame buffer writes. 
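
The addressing rules above can be sketched as a small C helper that computes the byte offset of a pixel from the linear frame buffer base. This is only an illustration: the function names are hypothetical, and the scan line sizes are the 2048-byte and 4096-byte figures quoted in the text, so only the two stride constants would change if the logical line length differs.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per pixel for an LFB write format, or 0 for unsupported formats
 * (3 and 6-11). Formats 0-2 and 15 are treated as 16-bit accesses. */
unsigned lfb_bytes_per_pixel(unsigned format)
{
    assert(format <= 15);
    if (format <= 2 || format == 15)
        return 2;
    if (format == 4 || format == 5 || (format >= 12 && format <= 14))
        return 4;
    return 0;
}

/* Byte offset of pixel (x, y) from the LFB base, using the logical scan line
 * sizes quoted in the text: 2048 bytes (16-bit) or 4096 bytes (32-bit). */
uint32_t lfb_write_offset(unsigned format, uint32_t x, uint32_t y)
{
    unsigned bpp = lfb_bytes_per_pixel(format);
    uint32_t stride = (bpp == 2) ? 2048u : 4096u;

    assert(bpp != 0);
    return y * stride + x * bpp;
}
```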


Linear frame buffer Reads 


It is important to note that reads from the linear frame buffer bypass the PCI host FIFO (as well as the 
memory FIFO if enabled) but are blocking. If the host FIFO has numerous commands queued, then the 
read can potentially take a very long time before data is returned, as data is not read from the frame buffer 
until the PCI host FIFO is empty and the graphics pixel pipeline has been flushed. One way to minimize 
linear frame buffer read latency is to guarantee that the Banshee graphics engine is idle and the host 
FIFOs are empty (in the status register) before attempting to read from the linear frame buffer. 


Programming Caveats 


The following is a list of programming guidelines which are detailed elsewhere but may have been 
overlooked or misunderstood: 


Memory Accesses 


All memory accesses to Banshee registers must be 32-bit word accesses only. Linear frame buffer 
accesses may be 32-bit or 16-bit accesses, depending upon the linear frame buffer access format specified 
in lfbMode. Byte (8-bit) accesses are only allowed to the Banshee linear frame buffer. 


Determining Banshee Idle Condition 


After certain Banshee operations, and specifically after linear frame buffer accesses, there exists a potential 
deadlock condition between internal Banshee state machines which manifests itself when determining if the 
Banshee subsystem is idle. To avoid this problem, always issue a NOP command before reading the 
status register when polling on the Banshee busy bit. Also, to avoid asynchronous boundary conditions 
when determining the idle status, always read Banshee inactive in status three times. A sample code 
segment for determining Banshee idle status is as follows: 


/********************************************************************* 
 * SST_IDLE: 
 *   returns 0 if SST is not idle 
 *   returns 1 if SST is idle 
 *********************************************************************/ 
int SST_IDLE(void) 
{ 
    ulong j, i; 

    // Make sure SST state machines are idle 
    PCI_LMEM_WR(NOPCMD, 0x0); 
    i = 0; 
    while (1) { 
        j = PCI_LMEM_RD(STATUS);    // read the status register 
        if (j & SST_BUSY) 
            return (0);             // busy bit set: not idle 
        else 
            i++; 
        if (i > 3) 
            return (1);             // repeated idle reads: report idle 
    } 
} 


7. PLL Registers 


[Figure: PLL block diagram showing the phase detector, charge pump, and clock output] 


Description 
Video Clock PLL 

GRX Clock PLL 

Mem Clock PLL 
Genlock mode: in order for register 3da (VGA register) to reflect the status of vsync correctly, 
vgaInit0[1] needs to be set. 


pllCtrl registers 


Bit     Description 
1:0     K, Post divider value 
7:2     M, PLL input divider 
15:8    N, PLL multiplier 
16      Test, = 0. 

The output frequency of the PLLs is given by: 
fout = 14.31818 * (N + 2) / (M + 2) / (2^K). 
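
As a worked illustration of the formula, the C sketch below evaluates fout for given N, M and K and packs the three fields into a register value. The function names are hypothetical, and the field positions (K in bits 1:0, M in bits 7:2, N in bits 15:8) are an assumption of this sketch. The result is in the same units as the 14.31818 reference, presumably MHz.

```c
#include <assert.h>
#include <stdint.h>

#define PLL_REF_FREQ 14.31818   /* reference frequency from the formula */

/* Output frequency for multiplier N, input divider M and post divider K. */
double pll_fout(unsigned n, unsigned m, unsigned k)
{
    assert(k <= 3);   /* K is treated as a 2-bit field */
    return PLL_REF_FREQ * (n + 2) / (m + 2) / (double)(1u << k);
}

/* Assumed field layout: K in bits 1:0, M in bits 7:2, N in bits 15:8. */
uint32_t pll_ctrl_value(unsigned n, unsigned m, unsigned k)
{
    assert(n <= 0xFF && m <= 0x3F && k <= 3);
    return ((uint32_t)n << 8) | ((uint32_t)m << 2) | k;
}
```

For example, N=12, M=0, K=1 gives 14.31818 * 14 / 2 / 2, which is about 50.11.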
8. DAC Registers 


Register Name   I/O Address   Bits   R/W   Description 
dacMode         0x4c-0x4f                  Dac Mode 2:1 or 1:1 
dacAddr         0x50-0x53     8:0    R/W   Dac palette address 
dacData         0x54-0x57 
                0x58-0x5b     n/a 


dacMode 


Description 
Dac Mode 2:1 or 1:1 
Enable DPMS on Vsync 
Force Vsync value. 
Enable DPMS on Hsync 
Force Hsync value. 
Control crc2 collection mode (see crc2 register) 


dacAddr 
8:0     Palette Address 


This is the 9 bit CLUT address used for programming the CLUT. Unlike the VGA mechanism, this 
address does not auto increment, but has access to the entire 512 entries in the CLUT. 


dacData 


This is the 24 bit RGB value at the index programmed into dacAddr. The color values are always stored 
with red in bits [23:16], green in bits [15:8] and blue in bits [7:0]. 
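
Programming one CLUT entry through dacAddr and dacData might look like the following C sketch. The pointer io is a hypothetical word-indexed view of the register space, the function names are made up for this example, and the dacData offset of 0x54 is assumed to follow dacAddr at 0x50.

```c
#include <assert.h>
#include <stdint.h>

#define DAC_ADDR 0x50u   /* dacAddr offset */
#define DAC_DATA 0x54u   /* dacData offset (assumed to follow dacAddr) */

/* Pack a CLUT entry: red in bits 23:16, green in 15:8, blue in 7:0. */
uint32_t clut_pack(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}

/* Write one of the 512 CLUT entries. io is a hypothetical pointer to the
 * register space, indexed in 32-bit words. Because dacAddr does not
 * auto-increment, the index is written before every data write. */
void clut_write(volatile uint32_t *io, unsigned index, uint8_t r, uint8_t g, uint8_t b)
{
    assert(index < 512);
    io[DAC_ADDR / 4] = index;
    io[DAC_DATA / 4] = clut_pack(r, g, b);
}
```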
9. Video Registers (PCI) 


Register Name                  I/O Addr   Bits   R/W   Description 

vidMaxRgbDelta                 0x58       23:0   R/W   Maximum delta values for video filtering 
vidInStatus                    0x74       2:0    R     Video In Status register 
                                                       initial error term 
vidCurrentLine                 0x94       10:0   R     Current Scanline 
vidOverlayStartCoords          0x9c       31:0   R/W   Start Surface Coordinates [31:28] Overlay 
                                                       Start Screen Coordinates 
vidOverlayEndScreenCoord                               Overlay End Screen Coordinates 
vidOverlayDudx                                         Overlay horizontal magnification factor 
vidOverlayDudxOffsetSrcWidth   0xa8       31:0   R/W   Overlay horizontal magnification factor 
                                                       initial offset (bit 18:0) 
                                                       Overlay source surface width (bit 31:19) 
vidOverlayDvdy                                         Overlay vertical magnification factor 
vidOverlayDvdyOffset                                   Overlay vertical magnification factor initial 
                                                       offset 
vidCurrOverlayStartAddr        0xfc       23:0   R     Current overlay start address in use 


vidMaxRgbDelta 


The vidMaxRgbDelta register specifies the maximum delta values allowed for a pixel’s color components 
to be filtered in the video filter (4x1 tap filter or 2x2 box filter). Each of the three neighbor pixels is 
compared with the center pixel, and if any of the RGB or YCbCr components exceed that of the center 
pixel by +delta or -delta, that color component will be replaced by that of the center pixel in the filter. The 
purpose of this is to prevent the high frequency pixels from being filtered in the tap or box filter. 

Putting 0x01 in each of the delta values minimizes the amount of filtering while 0x3f maximizes it. The 
value 0x0 is undefined. 
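
The per-component comparison described above can be sketched in C as follows. This is a minimal model, not the hardware implementation: it assumes "exceeds" means a difference strictly greater than delta, it treats each component as an 8-bit value, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Return the value a neighbor component contributes to the filter: the
 * neighbor itself if it is within +/-delta of the center, otherwise the
 * center value. delta is the 6-bit limit from vidMaxRgbDelta (0x01-0x3f). */
uint8_t filter_clamp_component(uint8_t center, uint8_t neighbor, unsigned delta)
{
    assert(delta >= 0x01 && delta <= 0x3f);
    if ((unsigned)abs((int)neighbor - (int)center) > delta)
        return center;
    return neighbor;
}
```

With delta set to 0x01 almost every differing neighbor is replaced by the center value, so little filtering takes place; with 0x3f most neighbors pass through to the filter unchanged.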
In order to avoid stepping outside the source surface (3d or video surface), the tap and box filters use the 
value programmed in vidOverlayEndScreenCoord to determine when they are getting to the right and bottom 
edges of the overlay window, and perform point sampling on the edges of the source surface. For a full 
screen 3d surface to be displayed with the box filter on a 640 x 480 resolution, vidOverlayEndScreenCoord 
needs to be programmed with x=639 and y=479 for clamping to be performed properly. 


Description 


5:0 Maximum blue/V/Cr delta for video filtering (unsigned). Range from 0x1 to 0x3f. 0x0 
is undefined. 


is undefined. 
undefined. 


vidProcCfg Register 


The vidProcCfg register is the general configuration register for the Video Processor. It is written by the 
host upon reset only. 


0       1: Video Processor on, VGA mode off; 0: Video Processor off, VGA mode on. 
2 


Interlaced video out enable. For Banshee, since interlaced video out is not supported, 
this bit should remain 0 all the time. 


Half mode. 0 = disabled. 1 = enabled where desktop stride is added every other line. 
ChromaKeyEnable. 0 = off. 1 = on. 


ChromaKeyResultInversion: (0 = desktop transparent if desktop color matches or falls 
within the chroma-key color range; 1 = desktop transparent if desktop color does not 
match or fall within the chroma-key range) 


Desktop surface enable. 0 = do not fetch the desktop surface, 1 = fetch desktop surface 
Overlay surface enable. 0 = do not fetch the overlay surface, 1 = fetch overlay surface 


Video-in data displayed as overlay enable. 0 = do not display the video-in buffer directly 
as overlay. 1= use the video-in buffer address as the overlay start address (auto- 


Desktop clut bypass. 0 = do not bypass the clut in the RAMDAC, 1 = bypass the clut 
Overlay clut bypass. 0 = do not bypass the clut in the RAMDAC, 1 = bypass the clut 


Desktop clut select. 0 = use the lower 256 entries of the clut. 1 = use the upper 256 
entries. 


Overlay clut select. 0 = use the lower 256 entries of the clut. 1 = use the upper 256 
entries. 

Overlay horizontal scaling enable. 0=disabled. 1=enabled. 

Magnification factor determined by vidOverlayDudx. 

Overlay vertical scaling enable. 0=disabled. 1=enabled. 

Magnification factor determined by vidOverlayDvdy. 
17:16 Overlay filter mode 
00: point sampling 
01: 2x2 dither subtract followed by 2x2 box filter (for 3d only) 
10: 4x4 dither subtract followed by 4x1 tap filter (for 3d only) 
11: bilinear scaling 


20:18 Desktop pixel format 
000: 8bit palettized 
001: RGB565 undithered 
010: RGB24 packed 
011: RGB32 
100: Reserved 
101: Reserved 
110: Reserved 
111: Reserved 
23:21 Overlay pixel format 
000: Reserved 
001: RGB565 undithered 
010: Reserved 
011: Reserved 
100: YUV411 
101: YUYV422 
110: UYVY422 
111: RGB565 dithered 


24      Reserved (must be 0) 
25      Reserved (must be 0) 
26      2X mode which refreshes two screen pixels per video clock. 0 = 1X mode, 1 = 2X 
        mode. 
27      HW cursor enable. 0 = disabled, 1 = enabled. 
28      Reserved. 
29      Reserved. 
30      Reserved. 
31      Backend deinterlacing for video overlay. 0 = No deinterlacing in the backend pipe. 1 = 
        Backend deinterlacing (Bob method). Bob method displays either the even or odd frame 
        at a time, and interpolates two interlaced lines to get the missing field. It is not 
        supported in 2X mode. 
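
A vidProcCfg value can be assembled from the fields whose bit positions are listed explicitly in the table (17:16, 20:18 and 23:21). The C sketch below is an illustration only: the macro, enum and function names are hypothetical, bit 0 is assumed to be the Video Processor enable, and all remaining bits are left at 0.

```c
#include <assert.h>
#include <stdint.h>

#define VIDCFG_PROC_ENABLE      (1u << 0)    /* assumed bit 0: Video Processor on */
#define VIDCFG_FILTER_SHIFT     16           /* bits 17:16: overlay filter mode */
#define VIDCFG_DESK_FMT_SHIFT   18           /* bits 20:18: desktop pixel format */
#define VIDCFG_OVL_FMT_SHIFT    21           /* bits 23:21: overlay pixel format */

enum { FILTER_POINT = 0, FILTER_BOX_2X2 = 1, FILTER_TAP_4X1 = 2, FILTER_BILINEAR = 3 };
enum { DESK_8BPP = 0, DESK_RGB565 = 1, DESK_RGB24 = 2, DESK_RGB32 = 3 };
enum { OVL_RGB565 = 1, OVL_YUV411 = 4, OVL_YUYV422 = 5, OVL_UYVY422 = 6, OVL_RGB565_DITHERED = 7 };

/* Compose a vidProcCfg value with the Video Processor enabled. */
uint32_t vid_proc_cfg(unsigned filter, unsigned desk_fmt, unsigned ovl_fmt)
{
    assert(filter <= 3 && desk_fmt <= 3);
    assert(ovl_fmt == 1 || (ovl_fmt >= 4 && ovl_fmt <= 7));   /* skip reserved codes */
    return VIDCFG_PROC_ENABLE
         | ((uint32_t)filter << VIDCFG_FILTER_SHIFT)
         | ((uint32_t)desk_fmt << VIDCFG_DESK_FMT_SHIFT)
         | ((uint32_t)ovl_fmt << VIDCFG_OVL_FMT_SHIFT);
}
```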


How to program for Backend deinterlacing (Bob method): 

The only thing this option affects is that when the video processor displays the even field, it adds 0.5 to 
the initial vertical offset (initial dvdy offset) used by the backend bilinear scaler. Everything else is the 
same. 

Since deinterlacing in the backend uses the bilinear scaler unit to interpolate between two interlaced lines, 
the host needs to enable bilinear filtering, overlay vertical scaling, overlay horizontal scaling, and set up 
the initial dvdy offset, dvdy, initial dudx offset and dudx correctly according to the desired magnification 
factor between the source video and displayed video. The suggested setting for the parameters for backend 
deinterlacing without horizontal magnification are: bilinear filter enable = 1, overlay vertical scaling 
enable = 1, overlay horizontal scaling enable = 0, initial dvdy offset = 0.25, dvdy = 0.5. Initial dudx offset 
and dudx are don’t cares. 

Backend deinterlacing is not supported for 2x mode (2-pixel per video clk mode) since bilinear filtering is 
not available in 2x mode. 




How does the hardware work in half-mode/ low-resolution mode? 

The video refresh has an internal register which stores the memory address of where a scanline starts in 
the desktop surface. At vertical retrace, this internal register is loaded with the value of 
vidDesktopStartAddr. When half-mode is disabled, a stride is added to this internal register at the end of 
every scanline to move to the next scanline. However, when half-mode is enabled, the stride is added at 
the end of every other scanline, i.e., at the end of scanline 1, 3, 5,7, ...... etc. As a result, each line of the 
desktop surface will be displayed twice, and the height of the video display will be double the height of the 
desktop surface. 


As can be seen, the half-mode bit doubles the video display in the y-direction only. To double the 
number of pixels in the x-direction, one needs to halve the video clock frequency. For example, to display a 
desktop surface of 320 x 240 on the monitor as 640 x 480 at 60 frames per second, a video clock of 
frequency 12.59MHz (25.175MHz / 2) is needed. Also, while the vertical VGA timing parameters are the 
same as those for 640 x 480, the horizontal parameters (e.g., total number of horizontal pixels, width of 
Hsync, number of pixels in horizontal blank, etc.) need to be halved. With half of the video clock 
frequency and the horizontal timing parameters, the monitor will see a timing which is equivalent to a 
width of 640 pixels. Lastly, the video register vidScreenSize needs to be programmed with x = 320 and y = 
480, and each pixel presented by the video refresh unit to the monitor will be displayed twice in the 
x-direction. 


Line doubling is implemented for the desktop surface only, and is not available for the overlay surface and 
hardware cursor. Therefore if the overlay and the hardware cursor are enabled in half-mode, they will be 
doubled in the x-dimension while the y-dimension will remain the same. 
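
The scanline addressing in half-mode can be modeled with a few lines of C. This is a sketch of the behavior described above, assuming scanlines are numbered from 0 so that the stride is first added at the end of scanline 1; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Memory address at which displayed scanline `line` starts. With half-mode
 * enabled the stride is added only after scanlines 1, 3, 5, ..., so each
 * surface line is shown on two consecutive scanlines. */
uint32_t desktop_line_addr(uint32_t start_addr, uint32_t stride,
                           uint32_t line, int half_mode)
{
    assert(half_mode == 0 || half_mode == 1);
    return start_addr + stride * (half_mode ? line / 2 : line);
}
```

In half-mode, scanlines 0 and 1 both start at vidDesktopStartAddr, scanlines 2 and 3 start one stride later, and so on, which is why a 240-line surface fills 480 displayed lines.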


hwCurPatAddr Register 


The hwCurPatAddr register stores the starting address of two monochrome cursor patterns. Each pattern 
is a bitmap of 64-bit wide and 64-bit high (a total of 8192 bits). The two patterns are stored in such a way 
that pattern 0 always resides in the lower half (least significant 64-bit) of a 128-bit word and pattern 1 the 
upper half. In other words, each 128-bit word consists of one line from pattern 0 and one line from pattern 
1. At each horizontal retrace, the Video Processor checks to see whether the cursor location falls on the 
current scanline. If so, it fetches from the memory eight words of cursor patterns at a time. The eight 
words are then stored in the on-chip ram for use in the next eight scanlines. This reduces the number of 
memory accesses for cursor patterns from 64 to 8 times per screen refresh. Cursor patterns always reside 
in linear address space, and the linear stride is always 16 bytes. The video processor figures out the shape 
and color of the cursor for the current scanline according to the following table: 


Bit from     Bit from     Displayed cursor            Displayed cursor 
Pattern 0    Pattern 1    (Microsoft Windows)         (X11) 

0            0            HWCurC0                     Current screen color 
0            1            HWCurC1                     Current screen color 
1            0            Current screen color        HWCurC0 
1            1            NOT current screen color    HWCurC1 


Bit 
Physical address of where the cursor pattern resides in the memory. 
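
The memory layout of the two cursor patterns can be expressed as a short C sketch. It assumes little-endian byte order, so that the least significant 64 bits of each 128-bit word (pattern 0) occupy the first 8 bytes; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CURSOR_LINES   64u   /* each pattern is 64 bits wide by 64 lines high */
#define CURSOR_STRIDE  16u   /* one 128-bit word (16 bytes) per cursor line */

/* Byte offset, relative to hwCurPatAddr, of the 64 bits holding `line` of
 * `pattern` (0 or 1). Assumes little-endian byte order, so the least
 * significant 64 bits of each 128-bit word come first in memory. */
uint32_t cursor_line_offset(unsigned pattern, unsigned line)
{
    assert(pattern <= 1 && line < CURSOR_LINES);
    return line * CURSOR_STRIDE + pattern * 8u;
}

/* Total bytes occupied by both patterns (8192 bits). */
uint32_t cursor_pattern_bytes(void)
{
    return CURSOR_LINES * CURSOR_STRIDE;
}
```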
hwCurLoc Register 


The hwCurLoc register stores the x and y coordinates of the bottom right corner of the cursor. The 
coordinates are unsigned, and range from 0 to 2047. This allows a partial cursor to be displayed in all 
edges of the screen. 


Bit     Description 
10:0    X coordinate of the bottom right corner of the cursor. Undefined upon reset. 
26:16   Y coordinate of the bottom right corner of the cursor. Undefined upon reset. 
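
Packing a cursor position into a hwCurLoc value can be sketched in C as follows. The Y field position (bits 26:16) comes from the table; placing X in bits 10:0 is an assumption based on the 0 to 2047 coordinate range, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the bottom-right corner of the cursor into a hwCurLoc value.
 * Y occupies bits 26:16; X is assumed to occupy bits 10:0. */
uint32_t hw_cur_loc(unsigned x, unsigned y)
{
    assert(x <= 2047 && y <= 2047);
    return ((uint32_t)y << 16) | x;
}
```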


hwCurC0 Register 


The hwCurC0 register stores color 0 of the cursor. 


hwCurC1 Register 


The hwCurC1 register stores color 1 of the cursor. 


vidInFormat 


The VidInFormat register allows the host to specify the data format of the video-in data. 


Description 
poke 


(VMI only) Video-In data format 
110: 8bit YCbCr 4:2:2 (UYVY) 101: 8bit YCbCr 4:2:2 (YUYV) 
100: 8bit YCbCr 4:1:1 


(VMI only) Video-In de-interlacing mode. (0 = No deinterlacing applied to the video data 
coming in; 1 = Weave method deinterlacing, i.e. the video-in port will merge two consecutive 
VMI frames into one inside the frame buffer before signaling a frame is done in the vidStatus 
register.) 

(VMI/TV out master mode) Vmi_vsync_in polarity. (1=active low; 0=active high (default)) 
(VMI/TV out master mode) Vmi_hsync_in polarity. (1=active low; 0=active high (default)) 


7 (VMI only) Vmi_vactive_in polarity. (1=active low; 0=active high (default)) 
(TV out only) G4 for posedge (1=Brooktree TV out support; 0=Chrontel) 

1: Brooktree TV encoder samples at falling edge for the following data; 0: Chrontel TV encoder 
samples at rising edge for the following data 

Data[11] 
Data[10] 
... 
Data[0] 

1: Brooktree TV encoder samples at rising edge for the following data; 0: Chrontel TV encoder 
samples at falling edge for the following data 

Data[11] RO[7] 
Data[10] RO[6] 
... 


(VMI/TV out) VideoIn interface configuration 

00: VideoIn Interface not used 

01: VMI (Also need to clear vidSerialParallelPort reg bit[29] to 0. This controls GPIO[1], which 
is used in the reference board to select the VMI device as the driver of some shared pins between 
VMI and TV out). 

10: VideoIn Interface reconfigured to digital TV out (Also need to set vidSerialParallelPort reg 
bit[29] to 1. This controls GPIO[1], which is used in the reference board to de-select the VMI 
device as the driver of some shared pins between VMI and TV out). 

11: Reserved 

(VMI/TV out) TV Out Genlock enable. 

0: The VMI logic of the video controller uses vmi_pixclk_in as its clock while the remaining 
video logic uses a separate video clock from the on-chip PLL. For TV out mode, vmi_vsync and 
vmi_hsync are output of the video controller. Vmi_vactive is always output of the video 
controller. (TV encoder is the slave device.) 

1: Both the VMI logic and the remaining video logic use vmi_pixclk_in as their clock. For TV 
out mode, vmi_vsync and vmi_hsync are inputs of the video controller. Vmi_vactive is always an 
output of the video controller. (TV encoder is the master device.) Setting bit 16 to 1 allows 
Banshee to genlock to the clock of an external VMI device or TV encoder. 

(VMI/TV out) not_use_vga_timing signal (Timing signals include vert_exra, display_ena, 
vfrontporch_active, vbackporch_active, vblank, vga_blank_n, vga_vsync, vga_hsync) 

0: Use the timing signals supplied by the VGA. For VMI and TV out slave mode. 

1: Do not use the timing signals from the VGA. Timing signals are either supplied by the TV 
encoder (in its master mode) or generated internally by the video controller. For TV out master 
mode only. 

Reserved. Used to be Field detection timing. (1=leading edge of Vsync; 0=trailing edge of Vsync 
(default)) 


VMI field detection 


Note that the polarity of the VMI Vsync, Hsync, and Vactive signals is programmable. The inactive-going 
edge of the Vsync signal indicates whether the field is odd or even. If Hsync is active during the inactive-going 
edge of Vsync, the field is even. If Hsync is inactive, the field is odd. 
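
The field rule above can be sketched in C as follows. This is only an illustration of the stated rule, not the hardware implementation; the type and function names are hypothetical, and the active_low flag is assumed to mirror the polarity bits of vidInFormat (1=active low).

```c
#include <stdbool.h>

/* Hypothetical field identifiers (1 = even, 0 = odd). */
typedef enum { FIELD_ODD = 0, FIELD_EVEN = 1 } vmi_field_t;

/* Convert a raw pin level into a logical "active" flag.
 * active_low corresponds to a polarity bit value of 1 (1 = active low). */
static bool signal_active(bool pin_level, bool active_low)
{
    return active_low ? !pin_level : pin_level;
}

/* Evaluate at the inactive-going edge of Vsync, passing the Hsync pin
 * level sampled at that instant. Hsync active means an even field. */
static vmi_field_t vmi_detect_field(bool hsync_pin_level, bool hsync_active_low)
{
    return signal_active(hsync_pin_level, hsync_active_low) ? FIELD_EVEN
                                                            : FIELD_ODD;
}
```

With the default active-high polarity, a high Hsync level at the sampling edge yields FIELD_EVEN; with active-low polarity the same level yields FIELD_ODD.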


vidInStatus 


The VidInStatus register allows the host to read the status of the video-in port, and implement manual 
buffer flipping for the video-in data. 


Description 


Even/odd field of the frame VMI has just finished drawing. 1=even; 0=odd. 
Video-in buffer VMI has just finished writing to. 
00=buffer 0 (as specified by vidInAddr0); 
01=buffer 1 (as specified by vidInAddr1); 
10=buffer 2 (as specified by vidInAddr2); 
11=No buffer is ready yet; the video processor is still working on the first frame 


vidSerialParallelPort Register 


The vidSerialParallelPort register controls the chip’s I2C, DDC, GPIO, and the host port interface. 
Bit[17:0] of the register are shared between the I2C and GPIO interface. If VideoIn interface is configured 
to VMI (vidInFormat bit[15:14] == 2’b01), vidSerialParallelPort[17:0] are for VMI’s host port interface. 
If configured to TV out (vidInFormat bit[15:14] == 2’b10), the bits are used to control digital TV out’s 
additional GPIO interface. This is in addition to the two GPIO pins (one input and one output), which are 
controlled by vidSerialParallelPort[30:29]. 


Since VMI, TV out, and ROM share pins for their interface, a pin can be input or output depending on 
which interface has control of the pin at that time. GPIO[0] (different from TV_out_GPIO[0]) is a 
hardwired output pin designed to be an output enable of the on-board tristate drivers. GPIO[0] is asserted 
low, when the VMI device has control of the shared pins, and is driving pixdata[7:0], vmi_rdy_n, and 
vmi_intreq_n as input to Banshee. GPIO[0] is pulled high, when either ROM or TV out controls the 
shared pins, and pixdata[7:0], vmi_rdy_n, and vmi_intreq_n are output of Banshee. 


GPIO[1] is software programmable, and is used to control the output enable of the on-board tristate 
drivers for vmi_pixclk, vmi_vsync, vmi_hsync, and vmi_vactive. These signals should be 
continually driven by the external VMI device even when the ROM is using the shared pins (the ROM does 
not use the vmi_pixclk, vmi_vsync, vmi_hsync, and vmi_vactive pins). Otherwise the internal state of the 
VMI controller in Banshee may be corrupted. Vmi_cs_n cannot be used in lieu of GPIO[1] for this 
purpose because the chip select pin can be turned off by the VMI parallel host interface enable bit (bit 0 
below). 


Description (VMI is enabled) (digital TV out interface is enabled) 
B 


| | | VMI host port interface | General purpose IO interface 
ee | VMI parallel host interface enable. Not used 

(0=off, 1=on); Default to 0 upon reset. 

|ModeA | ModeBO | 


aes ae 
VMI DS_N (Data TV out GPIO[0] (Output ONLY!) 
encoder Strobe) 
ae 
(Read/ Write_n) (Write) 
4 VMI VMI data VMI data RDY Not used 
DTACK_N (Data (Data Ready) 
Acknowledge) 
= disabled); Default to 1 upon reset. 
I 


17:14 | host/ VMI Address 
encoder 
(for bits 
14 and 


DDC interface DDC interface 


DDC port enable (0 = disabled, 1 = DDC port enable (0 = disabled, 1 = 
enabled) Default to 0 upon reset. enabled) Default to 0 upon reset. 

DDC DCK write (0 = DCK pin is driven | DDC DCK write (0 = DCK pin is driven 
low, 1= DCK pin is tri-stated) low, 1= DCK pin is tri-stated) 

When this pin is tri-stated, other devices | When this pin is tri-stated, other devices 
can drive this line, and the final state of can drive this line, and the final state of 
the pin is reflected in bit 26. Default to 1 | the pin is reflected in bit 26. Default to 1 
upon reset. upon reset. 
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20 host DDC DDA write (0 = DDA pin is driven 
low, 1= DDA pin is tri-stated) 

When this pin is tri-stated, other devices 
can drive this line, and the final state of 
the pin is reflected in bit 27. Default to 1 


upon reset. 


Monitor | DDC DCK state (read only, 0 = low, 1 = 
tri-stated which means no device is 
driving this pin) 

Monitor DDC DDA state (read only, 0 = low, 1 = 
tri-stated which means no device is 
driving this pin) 

| | 
host 


host 


I2C port enable (0 = disabled, 1 = 
enabled) Default to 0 upon reset. 

I2C SCK write (0 = SCK pin is driven 
low, 1= SCK pin is tri-stated) 

When this pin is tri-stated, other devices 
can drive this line, and the final state of 
the pin is reflected in bit 21. Default to 1 
upon reset. 


VMI/ I2C SCK state (read only, 0 = low, 1 = 
encoder tri-stated which means no device is 
driving this pin) 


VMI/ I2C SDA state (read only, 0 = low, 1 = 
encoder tri-stated which means no device is 
driving this pin) 


host I2C SDA write (0 = SDA pin is driven 
low, 1= SDA pin is tri-stated) 
When this pin is tri-stated, other devices 
can drive this line, and the final state of 
the pin is reflected in bit 22. Default to 1 
upon reset. 
| 


VMI reset_n (1 = normal. 0 = reset VMI 
device.) Default to 0 upon reset. 


input only gpio GPIO[2] input 


VMI/ 
encoder 


21 
22 
24 
25 
26 
27 
29 
30 
31 


DDC DDA write (0 = DDA pin is driven 
low, 1= DDA pin is tri-stated) 

When this pin is tri-stated, other devices 
can drive this line, and the final state of 
the pin is reflected in bit 27. Default to 1 
upon reset. 

DDC DCK state (read only, 0 = low, 1 = 
tri-stated which means no device is driving 
this pin) 

DDC DDA state (read only, 0 = low, 1 = 
tri-stated which means no device is driving 
this pin) 


I2C interface I2C interface 


I2C port enable (0 = disabled, 1 = enabled) 
Default to 0 upon reset. 

I2C SCK write (0 = SCK pin is driven 
low, 1= SCK pin is tri-stated) 

When this pin is tri-stated, other devices 
can drive this line, and the final state of 
the pin is reflected in bit 21. Default to 1 
upon reset. 

I2C SDA write (0 = SDA pin is driven 
low, 1= SDA pin is tri-stated) 

When this pin is tri-stated, other devices 
can drive this line, and the final state of 
the pin is reflected in bit 22. Default to 1 
upon reset. 

I2C SCK state (read only, 0 = low, 1 = tri-stated, 
which means no device is driving 
this pin) 

I2C SDA state (read only, 0 = low, 1 = tri-stated, 
which means no device is driving 
this pin) 


output only gpio GPIO[1] output output only gpio GPIO[1] output 


F T 
input only gpio GPIO[2] input 


Not used Not used 
31 Not used Not used 


H3 Pinout sharing VMI, ROM and TV Out 3/3/98 


Notes: 

1. ROM access, VMI video data/host port access, and TV out can only be performed separately. 

2. The only exception to 1 is "bypass" mode, where the VMI device can stream data and timing info directly into 
the TV encoder, while H3 snoops the data to display on the RGB monitor, either windowed or full screen. 

3. This solution allows simultaneous RGB and TV out of the Windows desktop, and simultaneous VMI and TV out for full 
screen VMI on the TV and windowed VMI on the RGB, but not windowed VMI on the TV. 

4. The type in the table below is in reference to H3. 

5. The VMI device or TV encoder can be programmed via I2C, e.g. for PAL mode. 

6. The TV encoder must be able to operate in master mode, where it supplies the clock, vsync, hsync, and blank, and H3 outputs 
clock_out (a delayed version of clock_in) and synchronous data. 

7. We must route a reference board to make sure the pin functions have been shared to provide a decent route. 

8. The ROM cs_n is tied to GND; oe_n and we_n are used to control read and write respectively. 


Pin Function 
Pin Function Rom Digital TV out 
Pin Name Pin Number Access Pin Function VMI rising/falling edge 


ier a A RM co 
| pixdataat | | a ot trict» | in XT BRO out_ 
| pixdataaa2 | | Tt =r” | in = Bat out _ 
| pixdataas | | Tt =r | in = Govt _ 
| pixdataad | | at Sra | in = BGS out _ 
| pixdataas | | Tt ST v5CrgvCs*§ | in =~] BG out _ 
| pixdataas | | Tt =| VerCrece* | in =| BG? | out_| 
| pixdataa7 | | Tt Sri” | in = SB out _ 
| out | vadaro | tf BR out _ 
-—oul_{_vaddrt_{_out_{_@zin_{_out_ 
vmi_adr/a10 | out | vada | tt | are out 
Pets 1 — vag I ot eae rl put 
| vmicsn | | ~cANNoTuse! | | vmicsn | cout_| CANNOTUSE! |__| 
| vmirw ft mi rw vie | out | clock out | out_| 
| _vmids ais | S| AS Tout vmids nvmird n | out | Tv ouTGPIOO | out | 
|_vmi rdyfbusy | | 2 out vmi tack _n/vmi rdy n*] in | Tv OUT GPIO 1 | in/out _| 
vmihdata | S| 0 invout_ | vito invout | NoTUseD |__| 
| vmihdata | | it invout | vi td 1 ivout | NoTuseo |__| 
| vmihdatta | =| Se invout | vit 2 | ivout |  NoTUseD |__| 
| vmihdatta | Ss | | invout | vita 3 | invout | NoTuseo | 
| vmihdatta [| S| a invout | vit 4 ivout | noTuseo |__| 
| vmibdatta | S| 5 invout | vit 5S ivout | NoTUSED |__| 
| vmihdatta | | | invout | vit 6 | ivout | NoTUseD |__| 
| vmibdatta | S| | invout | mi td 7 | ivout | NoTUseD |__| 
| hsyng | S| oTuseD | syne Tin’ XS sync | invoutt_| 
| vsyneg | | NoTUsED || vsync pn | in’ | vsync | invout_| 
| blank | | SC oTuseD | Tank n | in’ XS] blank n | invoutt_| 
| pix ckin | S| ~—noruseo, | | vid ck in |] in =] clock in | in _ 
|_vmiinsertn | __[Can this be detected by Software instead of wastingapin?--PinRemoved | 
| vmiintegn | | At Tt mi into Tin’ =| TV OUT GPIO.2 | out _| 
| reset out__| S| ~—noruseo, | | reset | ot | reset | outt_| 
| romoen | | rome n | 0 | CANNoTUsE! | | CANNoTUSE! |_| 
| romwen | S| rom wen | o | CANNoTUSE! | | —CANNOTUSE! |_| 
| i2c ck | S| SC NoruseoD | it ok Tt Sit ck Tt 
| i2e data | S| =~ Noruseo> | | ik data | ivout | id data | in/out_| 
ap | mice nh | out | vice n out | vice n | out 
ap | vi sync oe n | out | vmisync oe n | out _| vmi sync oe n_| out __| 


* means the signal may be buffered from the VMI data bus to ensure that it is not driven during ROM accesses. 


Issues: 
1. The Brooktree part does not support CCIR656, where data is transferred on rising edges only with a 2X clock. 
The Brooktree part uses a 1X clock and pumps data on both edges. 


vidInXDecimDeltas (for VMI downscaling Bresenham Engine) / vidTvOutBlankHCount (for 
TV out master mode) 

If VideoIn Interface is configured to VMI mode (i.e., VidInFormat[15:14] == 2’b01), vidInXDecimDeltas 
bits [11:0] contain the width of the destination video-in surface (width of the video overlay stored in the 
frame buffer) in number of pixels. VidInXDecimDeltas bits[27:16] contain the width difference between 
the source video-in surface (from VMI port) and destination video-in surface in number of pixels (Source - 
Destination) 


Bit    Description 
11:0   The positive (unsigned) value added to the error term when the horizontal Bresenham error 
       term is <0. It is programmed to be the width of the destination video-in surface in number 
       of pixels. 
15:12  reserved 
27:16  The positive (unsigned) value added to the error term when the horizontal Bresenham error 
       term is >0. It is programmed to be the difference between the width of the source and 
       destination video-in surfaces (Source - Destination) in number of pixels. 
31:28  reserved 


If VideoIn Interface is configured to digital TV out (i.e., VidInFormat[15:14] == 2’b10), TV Out Genlock 
is enabled (VidInFormat[16] == 1’b1), and Not_use_vga_timing_signal is asserted, vidInXDecimDeltas 
bits[10:0] contain the number of clock cycles after the leading edge of vmi_hsync before the horizontal 
active region starts (i.e., horizontal blank becomes deasserted). 


vidInXDecimDeltas bits[26:16] contain the number of clock cycles after the leading edge of vmi_hsync 
before the horizontal active region ends (i.e., horizontal blank is re-asserted). 


Output blank_n == horizontal blank_n AND vertical blank_n. 


Note that the value in bits[26:16] needs to be greater than bits[10:0]. The clock cycles are based on the 
clock coming in through the vmi_pixclk pin. 


Bit    Description 
10:0   The number of clock cycles after the leading edge of vmi_hsync before the horizontal active 
       region starts (i.e., horizontal blank becomes deasserted). 
15:11  reserved 
26:16  The number of clock cycles after the leading edge of vmi_hsync before the horizontal active 
       region ends (i.e., horizontal blank is re-asserted). 
31:27  reserved 


vidInDecimInitErrs 


Description 
The signed (2’s complement) initial value of the error term in the horizontal Bresenham 
accumulator 


reserved 


The signed (2’s complement) initial value of the error term in the vertical Bresenham 
accumulator 


reserved 


vidInYDecimDeltas 


If VideoIn Interface is configured to VMI mode (i.e., VidInFormat[15:14] == 2’b01), vidInYDecimDeltas 
bits[11:0] contain the height of the destination video input window (height of the video overlay stored in 
the frame buffer) in number of lines. vidInYDecimDeltas bits[27:16] contain the height difference between the 
source video surface (from VMI port) and destination video input window in number of lines (Source - 
Destination). 


Bit    Description 
11:0   The positive value added to the error term when the vertical Bresenham error term is <0 
15:12  reserved 
27:16  The positive value added to the error term when the vertical Bresenham error term is >0 
31:28  reserved 


If VideoIn Interface is configured to digital TV out (i.e., VidInFormat[15:14] == 2’b10), and TV Out 
Genlock is enabled (VidInFormat[16] == 1’b1), the vertical blank_n signal is de-asserted when the number of 
positive edges of vmi_hsync after the positive edge of vmi_vsync == vidInYDecimDeltas bits[10:0]. 


The vertical blank_n signal is re-asserted when the number of positive edges of vmi_hsync after the positive 
edge of vmi_vsync == vidInYDecimDeltas bits[26:16]. 


Output blank_n == horizontal blank_n AND vertical blank_n. 


Note that the value in bits[26:16] needs to be greater than bits[10:0]. The clock cycles are based on the 
clock coming in through the vmi_pixclk pin. 


Bit    Description 
10:0   The number of vmi_hsync LEADING edges after the LEADING edge of vmi_vsync before 
       the vertical active region starts (i.e., vertical blank becomes deasserted). 
15:11  reserved 
26:16  The number of vmi_hsync LEADING edges after the LEADING edge of vmi_vsync before 
       the vertical active region ends (i.e., vertical blank is re-asserted). 
31:27  reserved 


Bresenham scaler for scaling down a video window in the horizontal direction: 
    error = vidInXDecimInitErr 
    repeat until the source pixels of a video window scanline are exhausted 
        if (error < 0) 
            move to next source pixel 
            error = error + vidInXDecimDelta1 
        else 
            select the current source pixel as the destination pixel 
            move to next source pixel 
            error = error - vidInXDecimDelta2 

Copyright © 1996-1999 3Dfx Interactive, Inc. Revision 1.01 


86 Printed 03/01/99 




Voodoo Banshee Universal Access 2d Databook 


Bresenham scaler for scaling down a video window in the vertical direction: 
    error = vidInYDecimInitErr 
    at each VideoIn Hsync 
        if (error < 0) 
            skip the whole line of video in data 
            error = error + vidInYDecimDelta1 
        else 
            select the current line of video in data 
            error = error - vidInYDecimDelta2 


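
As a rough software model of the two scalers, the following C sketch applies the same error-term loop to one scanline. It assumes Delta1 is the destination size and Delta2 is (source - destination), as the register descriptions state; the function name, the 16-bit pixel type, and the initial error of 0 used in the note below are illustrative choices, not values taken from the hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Decimate n_src source pixels into dst using the error-term loop from the
 * pseudocode. delta1 is added after a skipped pixel (destination width);
 * delta2 is subtracted after a kept pixel (source - destination width).
 * Returns the number of pixels written to dst. */
static size_t decimate_line(const uint16_t *src, size_t n_src, uint16_t *dst,
                            int32_t init_err, int32_t delta1, int32_t delta2)
{
    int32_t error = init_err;
    size_t n_dst = 0;

    for (size_t i = 0; i < n_src; ++i) {
        if (error < 0) {
            error += delta1;          /* skip this source pixel */
        } else {
            dst[n_dst++] = src[i];    /* keep this source pixel */
            error -= delta2;
        }
    }
    return n_dst;
}
```

For example, decimating 8 source pixels to 4 with init_err = 0, delta1 = 4 and delta2 = 4 keeps source pixels 0, 2, 4 and 6. The vertical scaler follows the same loop with whole lines in place of pixels.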
vidPixelBufThold 


The vidPixelBufThold register determines how many empty slots in each of the three pixel buffers will trigger 
refilling of the buffers. 


Bit    Description 
5:0    Primary pixel buffer low watermark (0x0 = 1 empty slot; 0x3f = 64 empty slots) 
11:6   Secondary pixel buffer 0 low watermark (0x0 = 1 empty slot; 0x3f = 64 empty slots) 
17:12  Secondary pixel buffer 1 low watermark (0x0 = 1 empty slot; 0x3f = 64 empty slots) 
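
A register value for these watermarks could be assembled as sketched below. Only the 17:12 field position is explicit in the text; placing the primary buffer in bits 5:0 and secondary buffer 0 in bits 11:6 is an assumption based on the 6-bit field width, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Encode an empty-slot count (1..64) as a 6-bit field (0x00..0x3f). */
static uint32_t thold_field(unsigned empty_slots)
{
    return (uint32_t)(empty_slots - 1u) & 0x3fu;
}

/* Assumed layout: primary [5:0], secondary 0 [11:6], secondary 1 [17:12]. */
static uint32_t pack_pixel_buf_thold(unsigned primary, unsigned sec0,
                                     unsigned sec1)
{
    return thold_field(primary)
         | (thold_field(sec0) << 6)
         | (thold_field(sec1) << 12);
}
```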


vidChromaKeyMin Register 


The vidChromaKeyMin register contains the lower bound of the chroma key color. 


8-bit desktop color format 
15-bit desktop color format 
16-bit desktop color format 
24-bit desktop color format 
32-bit desktop color format 


Bit    Description 
7:0    Blue value of the chroma-key 
15:8   Green value of the chroma-key 
23:16  Red value of the chroma-key 
31:24  Reserved 


vidChromaKeyMax Register 


The vidChromaKeyMax register contains the upper bound of the chroma key color. It is the same as 
vidChromaKeyMin if the chroma-key is a single color instead of a range. 


Bit Description 


Format same as vidChromaKeyMin Register 


vidCurrentLine Register 


The vidCurrentLine register contains the current scan out line. As the vertical beam scans down the 
display this register is incremented. 


Bit 
Current Video scan line. 


vidScreenSize 


NOTE: Whenever the screen resolution is changed, video processor needs to be re-enabled by 
clearing vidProcCfg bit 0 and setting it to 1. This will reset the video processor. 


Bit    Description 


11:0 Width of the screen in number of pixels. If vidScreenX is specified to be bigger than 
1280, 2x mode needs to be enabled. 


23:12 Height of the screen in number of lines. 
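
A minimal C sketch of packing this register from the two fields listed above follows. The function names are hypothetical, and the code only builds the value; the write itself and the vidProcCfg bit 0 toggle described in the note are left to the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack width into bits [11:0] and height into bits [23:12]. */
static uint32_t pack_screen_size(uint32_t width, uint32_t height)
{
    return (width & 0xfffu) | ((height & 0xfffu) << 12);
}

/* Widths above 1280 pixels require 2x mode to be enabled. */
static bool needs_2x_mode(uint32_t width)
{
    return width > 1280u;
}
```

For a 640x480 screen this gives 0x1E0280.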


vidOverlayStartCoords 


Bit    Description 
11:0   The x-coordinate on the screen where the upper left corner of the overlay is located. 
23:12  The y-coordinate on the screen where the upper left corner of the overlay is located. 
25:24  The lower two bits of the x-coordinate for the first pixel (at the upper left corner) of the 
       overlay window with respect to the beginning of the source surface. Since the overlay 
       window may be partially occluded by the dimension of the screen, the first pixel of the 
       window may not necessarily be the first pixel of the source surface. The lower two bits 
       of the x-coordinate are used for undithering. 
27:26  The lower two bits of the y-coordinate for the first pixel (at the upper left corner) of the 
       overlay window with respect to the beginning of the source surface. Since the overlay 
       window may be partially occluded by the dimension of the screen, the first pixel of the 
       window may not necessarily be the first pixel of the source surface. The lower two bits 
       of the y-coordinate are used for undithering. 
31:28  reserved 
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vidOverlayEndScreenCoord 


Beware that for a full screen overlay window, for example in 640x480 resolution, the 
vidOverlayEndScreenCoord should be programmed to be (639, 479), since the screen coordinate system 
starts at the upper left origin as (0,0). 


Bit    Description 
11:0   The x-coordinate on the screen where the lower right corner of the overlay is located. 
23:12  The y-coordinate on the screen where the lower right corner of the overlay is located. 


vidOverlayDudx 


When setting vidOverlayDudx and vidOverlayDvdy, take care that the video refresh unit 
does not step outside the 3D or video surface when it gets to the rightmost and bottom edges of the overlay 
window. Since the video refresh unit does not know the dimensions of the 3D or video surface, if 
vidOverlayDudx and vidOverlayDvdy are set too large, the refresh unit may step outside the source 
surface and cause artifacts at the edges of the overlay window. 


Bit    Description 
19:0   Step size in source per horizontal step in screen space for magnification. Format is 0.20. 


vidOverlayDudxOffsetSrcWidth 


Bit[31:19] specifies the number of bytes of pixels that need to be fetched from the frame buffer to cover a 
line of the overlay window. The value depends on the width of the overlay window, the pixel depth of the 
overlay, the x-scaling factor, and the vidOverlayDudxOffset. The vidOverlayDudxOffset will affect 
the value only by +/- 1 pixel. The easiest way to figure out the value for the source width is to divide the 
number of pixels for the width of the overlay window by the x-scaling factor. Round the result up. Add 1 
to adjust for any DudxOffset, and finally multiply the value by the overlay pixel depth in bytes. This is a 
conservative way to estimate the source width, since it gives a value slightly bigger than the actual 
number of bytes that are needed. Putting in a value which is smaller or grossly larger than the actual 
number of bytes needed will cause serious artifacts. 


Bit    Description 
18:0   Initial offset of Dudx. Format is 0.19. 
31:19  Number of bytes needed to be fetched from the source surface in order to cover a whole 
       un-occluded scanline for the overlay (13 bits allows a max of 8K bytes for an overlay 
       scanline). 

       i.e., ((Overlay width in number of screen pixels * vidOverlayDudx) + 
       vidOverlayDudxOffset) * overlay pixel depth in bytes. 

       For non-scaled overlay with no offset, vidOverlayDudx becomes 1, and 
       vidOverlayDudxOffset becomes 0 in the above equation. 
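
The conservative recipe described for the source width (scale, round up, add 1, multiply by pixel depth) might be written as the following C sketch. The function name is hypothetical, and the step is passed as a 0.20 fixed-point number in which 1 << 20 stands for an unscaled overlay; that encoding of 1.0 is an assumption made for this helper only.

```c
#include <stdint.h>

/* Conservative source-width estimate in bytes.
 * overlay_width_px: overlay window width in screen pixels
 * dudx_0_20:        horizontal step in 0.20 fixed point (1.0 == 1 << 20)
 * bytes_per_pixel:  overlay pixel depth in bytes */
static uint32_t overlay_src_width_bytes(uint32_t overlay_width_px,
                                        uint32_t dudx_0_20,
                                        uint32_t bytes_per_pixel)
{
    /* width / scale factor == width * dudx; round up to whole pixels */
    uint64_t prod = (uint64_t)overlay_width_px * dudx_0_20;
    uint32_t src_px = (uint32_t)((prod + ((1u << 20) - 1u)) >> 20);

    src_px += 1u;                      /* allow for any DudxOffset */
    return src_px * bytes_per_pixel;
}
```

For a 1024-pixel-wide overlay at 16 bpp with a step of 0.625 (0xA0000), this yields 1282 bytes, one pixel more than the exact 1280. The result still has to fit in the 13-bit field.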


vidOverlayDvdy 


When setting vidOverlayDudx and vidOverlayDvdy, take care that the video refresh unit 
does not step outside the 3D or video surface when it gets to the rightmost and bottom edges of the overlay 
window. Since the video refresh unit does not know the dimensions of the 3D or video surface, if 
vidOverlayDudx and vidOverlayDvdy are set too large, the refresh unit may step outside the source 
surface and cause artifacts at the edges of the overlay window. 


Bit    Description 
19:0   Step size in source per vertical step in screen space for magnification. Format is 0.20. 


vidOverlayDvdyOffset 


Bit    Description 
18:0   Initial offset of Dvdy. Format is 0.19. 


Example: 


Given a source size of 640 x 240, magnified to 1024 x 768 on the screen: 


Source width: 
Dudx[31:19] = 640 x 2 bytes = 1280 bytes (here 16 bpp assumed) 
            = 500h 


Dudx[19:0] = 640/1024 = 0.625 = a0000h 
(Note: format 0.20 means 20 fractional bits and no integer bits.) 


Dvdy[19:0] = 240/768 = 0.3125 = 0.25 + 0.0625 = 50000h 
(Format same as Dudx above) 


Dudx Offset[18:0] and Dvdy Offset[18:0] = 00000h if no initial offset is needed. 
If the upper leftmost overlay pixel needs to be the center of the first pixel of the 
overlay surface, both offsets need to be set to 0.5, which is 40000h. 
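
The fixed-point values in this example can be reproduced with a short C sketch. The helper names are hypothetical, and the step helper assumes magnification (source smaller than destination), since a ratio of 1.0 or more does not fit a 0.20 field.

```c
#include <stdint.h>

/* Ratio src/dst as 0.20 fixed point; assumes src < dst (magnification). */
static uint32_t step_0_20(uint32_t src, uint32_t dst)
{
    return (uint32_t)(((uint64_t)src << 20) / dst) & 0xfffffu;
}

/* An offset of half a source pixel (0.5) in 0.19 fixed point. */
static uint32_t half_pixel_0_19(void)
{
    return 1u << 18;
}
```

step_0_20(640, 1024) gives 0xA0000 and step_0_20(240, 768) gives 0x50000, matching the Dudx and Dvdy values above; half_pixel_0_19() gives 0x40000.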


vidDesktopStartAddr 


Physical starting address of the desktop surface. This is a byte-aligned address. 


vidDesktopOverlayStride 


Bit    Description 
14:0   Bit[14:0] contains the linear stride of the desktop surface in bytes. If interlaced video output 
       mode is enabled, the linear stride is still programmed to 1x the regular stride of the 
       surface, and will be multiplied by 2 when used. 
30:16  Bit[30:16] contains the linear stride of the overlay surface in bytes. If interlaced video 
       output mode is enabled, the linear stride is still programmed to 1x the regular stride of 
       the surface, and will be multiplied by 2 when used. 


For video overlay, the stride needs to be a multiple of 4 bytes for YUV 422 pixel format and a multiple 
of 8 bytes for YUV 411 pixel format. This ensures that the right edge of the video source surface falls 
on a boundary of 2 pixels for YUV 422 and 4 pixels for YUV 411. The start address for the overlay is 
sampled from the FIFO’ed leftOverlayBuf and rightOverlayBuf registers. The start address needs to be 
aligned on a 32-bit boundary for YUV 422 pixel format and a 64-bit boundary for YUV 411 pixel 
format. 
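
The overlay alignment rules for this register (4 bytes for YUV 422, 8 bytes for YUV 411, for both stride and start address) can be checked in software before programming it. The following C sketch uses hypothetical type and function names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { OVERLAY_YUV422, OVERLAY_YUV411 } overlay_fmt_t; /* illustrative */

/* Required byte alignment: 4 for YUV 422, 8 for YUV 411. */
static uint32_t overlay_alignment(overlay_fmt_t fmt)
{
    return fmt == OVERLAY_YUV411 ? 8u : 4u;
}

/* True if both the start address and the stride meet the alignment rule. */
static bool overlay_layout_ok(overlay_fmt_t fmt, uint32_t start_addr,
                              uint32_t stride)
{
    uint32_t a = overlay_alignment(fmt);
    return (start_addr % a == 0u) && (stride % a == 0u);
}

/* Round a stride up to the next legal value for the format. */
static uint32_t overlay_align_stride(overlay_fmt_t fmt, uint32_t stride)
{
    uint32_t a = overlay_alignment(fmt);
    return (stride + a - 1u) & ~(a - 1u);
}
```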


vidInAddr0 


Bit 
Starting address of video-in buffer 0 


vidInAddr1 


Bit 
Starting address of video-in buffer 1 


vidInAddr2 


Bit 
Starting address of video-in buffer 2 


vidInStride 


This register contains the linear stride of the buffer in bytes. If interlaced video input 
mode is enabled, the linear stride is still programmed to 1x the regular stride, and will 
be multiplied by 2 before it is used. 


vidCurrOverlayStartAddr 


The vidCurrOverlayStartAddr register allows the host to read the start address which the video processor 
is using to refresh the overlay window for the current frame. 


Description 
23:0 Start physical address the video processor is using to refresh the overlay window. Read only. 
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Video-In Interface 
Function 


Video In Processor supports several connector interfaces for video data input. The following table shows 
the signals needed for each interface. 


Signals 


| VMI Video Port | | Hsyncin | Hsyncin | Hsyncin 
| VMI Video Port | | Vsyncin [| Vsyncin | Vsyncin 
| VMI Video Port | | Vactive | Vactiive 
P| Pixclk Gin) | Pixclk (in) | Pixclkin) 
i ee es 

DI 

A[3 


<si<l< 


MI Video Port 
MI Video Port 
MI I2C Port 
MI I2C Port 
MI Host Port 


MI Host Port | 


: 


[SCK(inout) | | 
Po 7:0) (infout) | DITO] 
PC AL3:0} (out) LAZO} 
MI Host Port Pp fecsn(out) cs 
MI Host Port Pods in Gout) rd 
MI Host Port Po win out) twin 
|VMIHost Port | | tack in) | ready 
DDC Port |SDA (in/out) | | 
DDC Port | SCK (in/out) | | 
| System signals | | vmi_reset_n (out) _[ vmi_reset_n (out) _| vmi_reset_n (out) | 


System signals vmi_int_n vmi_int_n 


System signals | | vmi_present_n (in) 


A. Video-In Interface: 


<1<|l<i<|<i<l< 


< 


General Description 


When video data arrives through the Video-In interface, it undergoes optional decimation and 
filtering and is packed into 128-bit words in a FIFO before being written into memory. As writes to memory 
are always aligned on a 128-bit boundary, the appropriate byte enables also need to be set with the writes. 
Supported pixel formats for the video-in data are YUV422 and YUV411. Both pixel formats are stored 
at 16 bits per pixel, which means that 4 bits are unused per pixel in the case of YUV411. 


Video data are stored in the Video-In frame buffers whose starting addresses are specified by the registers 
VidInAddr0, VidInAddr1, and VidInAddr2. VidInAddr1 and VidInAddr2 are used for double and triple 
buffering to avoid video tearing. However, since video comes in at a different rate from the video 
refresh, switching of the video-in drawing buffers is not synchronous to the Vsync of the video refresh. At 
the end of each VMI frame, the vmi_int input signal is asserted. The video processor then switches 
to the next video-in frame buffer for the next VMI frame if multiple buffering is enabled. If disabled, the 
same video-in frame buffer is overwritten. At the same time, the video processor also updates the 
VidInStatus register, which indicates the buffer VMI has just finished drawing (0, 1, 2) and whether the 
buffer contains an even or odd field. An interrupt signals the host to flip the display buffer for the 
video-in data. On the other hand, if the "Video_in data displayed as overlay enable" bit in VidProcCfg is 
set, the video processor does the display buffer flipping automatically for the overlay, provided that all 
the corresponding configuration registers for the overlay are set up correctly (e.g., overlay surface enable, 
overlay pixel format, overlay_dudx, etc.). 


If Weave video-in deinterlaced mode is enabled, the video processor detects the even/odd field from 
VREF (Vsync) and HREF (Hsync). If odd, the specified VidInAddr register is used as the starting 
address of the video-in frame buffer. If even, VidInStride is used as the starting address offset and 
added to the specified VidInAddr. The video-in buffer is switched at every other Vsync. VidInStride 
should be programmed to 1x the regular line stride regardless of 
whether the video-in data is interlaced or not. 
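
The weave addressing can be sketched in C as below. The function name is hypothetical; the doubling of the stride between lines of one field follows the vidInStride description (programmed as 1x, multiplied by 2 when interlaced input is used).

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte address of line `line_in_field` of one field inside a woven frame.
 * buf_addr: the selected vidInAddr0/1/2 value
 * stride:   vidInStride, 1x the regular line stride in bytes */
static uint32_t weave_line_addr(uint32_t buf_addr, uint32_t stride,
                                bool even_field, uint32_t line_in_field)
{
    /* Odd fields start at the buffer base, even fields one line later. */
    uint32_t field_start = even_field ? buf_addr + stride : buf_addr;

    /* Lines of the same field are two strides apart in the woven frame. */
    return field_start + line_in_field * (2u * stride);
}
```

With a 1280-byte stride, the odd field fills byte offsets 0, 2560, 5120 and so on, and the even field fills offsets 1280, 3840, 6400 and so on, so the two fields interleave into one frame.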


1. VMI 
-data: 


8-bit YCbCr interface is used. The data format is CCIR-656 YCbCr 422, and pixels arrive in the style of 
(Cb0[7:0] or U0[7:0]) -> Y0[7:0] -> (Cr0[7:0] or V0[7:0]) -> Y1[7:0]. 


Video data may be interlaced. 
-timing: 
Timing signals include VREF, HREF, VACTIVE, and PIXCLOCK. 


VREF and HREF are active high VSYNC and HSYNC. If HREF is high during the falling edge of VREF, 
the field is even. If HREF is low at that time, the field is odd. 


VACTIVE is a blanking signal which indicates pixel data is valid across the YCbCr bus. 


Video Limitation 


1. In 1x mode, 3 streams of pixel fetching will consume more memory bandwidth than available for 32- 
bit desktop. This means chroma-keying and bilinear filtering cannot be turned on simultaneously for 
32-bit desktop. 

2. In 2x mode (for any display larger than 1280 X 1024) where we refresh 2 screen pixels per 
cycle at 110MHZ, bilinear filtering is not supported. All backend zoom (magnification) is done 
by point sampling (replication). 


3. 1-10X backend zoom (magnification) with increments of 0.1X. Larger magnification is 
supported, but with bigger increments. 


1 to 1/16X video-in decimation (minimization) with increments of 0.015X. 


4. Retain the 3-bit tap filter for RGB565 dithered as an alternative 
to the 2x2 box filter. 


5. Interlaced video output mode is not implemented. 


6. The hardware cursor is 2-color only. 


7. YUV 411 pixel format will be stored as unpacked in the frame buffer. This means each pixel 
will occupy 16 bits instead of 12 bits. This makes pixel extraction easier, but consumes 
more memory. 


8. Video with YUV 422 format needs to be stored on a 4-byte memory boundary while YUV 411 
on a 8-byte boundary. This is necessary because UV are shared between 2 pixels in 422 
while UV are shared between 4 pixels in 411. 


10. AGP/CMD Transfer/Misc Registers 
Memory Base 0: Offset 0x0080000 


agpGraphicsStride  0x010(16)   Graphics stride 
agpMoveCMD         0x014(20)   Begin AGP transaction 


Misc 
yuvBaseAddress     0x100(256)  YUV planar base address 
yuvStride          0x104(260)  Y, U and V planes stride value 


agpReqSize 
agpReqSize defines the AGP packet transfer size. The maximum transfer size is a 4-Mbyte block of data. 
This register is read write and has no default value. 


reserved. Default is 0x0. 


agpHostAddressLow 


During AGP transfers this address defines the source address bits 31:0 of AGP memory to fetch data 
from. AGP addresses are 36-bits in length and are byte aligned. The upper 4 bits reside in the 
agpHostAddressHigh register. This register is read write, and defaults to 0. 


Lower 32 bits of AGP memory. Default is 0x0. 


agpHostAddressHigh 
The agpHostAddressHigh register defines the stride, width, and upper 4 bits of the source AGP address during 
AGP transfers. Stride and width are defined in quadwords. This register is read write, and defaults to 0. 


Bit    Description 
13:0   AGP Width. Default is 0x0. 
27:14  AGP Stride. Default is 0x0. 
31:28  Upper 4 bits of AGP memory. Default is 0x0. 
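
Splitting a 36-bit AGP source address across agpHostAddressLow and agpHostAddressHigh could look like the C sketch below. The function names are hypothetical, and placing the width in bits 13:0 is an assumption inferred from the stride field at 27:14.

```c
#include <stdint.h>

/* Low 32 bits of a 36-bit AGP byte address (agpHostAddressLow). */
static uint32_t agp_host_addr_low(uint64_t agp_addr)
{
    return (uint32_t)(agp_addr & 0xffffffffu);
}

/* agpHostAddressHigh: address bits 35:32 in [31:28], stride in [27:14],
 * width assumed in [13:0]; stride and width are given in quadwords. */
static uint32_t agp_host_addr_high(uint64_t agp_addr, uint32_t stride_qw,
                                   uint32_t width_qw)
{
    return ((uint32_t)((agp_addr >> 32) & 0xfu) << 28)
         | ((stride_qw & 0x3fffu) << 14)
         | (width_qw & 0x3fffu);
}
```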


agpGraphicsAddress 

agpGraphicsAddress defines the destination frame buffer address and type of the AGP transfer. At the 
beginning of an AGP transfer this address is loaded into an internal address pointer that increments for 
each data received over AGP. This register is read write, and defaults to 0. 


Bit 
Frame buffer offset. Default is 0x0. 


agpGraphicsStride 


agpGraphicsStride defines the destination stride of the AGP transfer, in bytes. This register is read 
write, and defaults to 0. 


Bit 
Frame buffer Stride. Default is 0x0. 


agpMoveCMD 


agpMoveCMD starts an AGP transfer. When started, agpHostAddress is loaded into the source pointer 
and agpGraphicsAddress is loaded into the destination pointer. The source pointer is incremented after 
data is fetched from AGP memory and written into frame buffer memory at the address given by the 
destination pointer. The destination pointer is then incremented after the data has been written. This 
register is write only and has no default. 

Description 
reserved 
Dest memory type (0=Linear FB, 1=planar YUV, 2=3D LFB, 3=texture port). Default is 0x0. 
Command stream ID. This bit defines which command fifo is used for a host initiated 
AGP data move. Default is 0x0. 


yuvBaseAddress 


The yuvBaseAddress register contains the starting frame buffer location of the YUV aperture. 


Bit 
YUV base Address. Default is 0x0. 


yuvStride 


The yuvStride register contains the destination stride value of the U and V planes. 


11. AGP/PCI Configuration Register Set 


| Vendor_ID      | 0     | 15:0 | 3Dfx Interactive Vendor Identification |
| Status         | 6     |      | PCI device status |
| Revision_ID    | 8     | 7:0  | Revision Identification |
| Class_code     | 9     | 23:0 | Generic functional description of PCI device |
| Reserved       |       |      | |
| Reserved       | 56-59 |      | Reserved |
| Interrupt_line | 60    | 7:0  | Interrupt Mapping |
| Reserved       | 68-75 |      | Reserved |
| cfgScratch     | 80    |      | Scratch pad register |
| ACPICapID      | 96    | 31:0 | ACPI Capability identifier register (read only) |


Vendor_ID Register 


The Vendor_ID register is used to identify the manufacturer of the PCI device. This value is assigned by 
a central authority that will control issuance of the values. This register is read only. 


3Dfx Interactive Vendor Identification. Default is 0x121a. 


Device_ID Register 


The Device_ID register is used to identify the particular device for a given manufacturer. This register is 
read only. 


Bit 
Banshee Device Identification. Default is 0x3. 


Command Register 


The Command register is used to control basic PCI bus accesses. See the PCI specification for more 
information. Bits 0, 1 and 5 are R/W, and bits 15:6 and 4:2 are read only. 


0 I/O Access Enable. Default is 0. 


Status Register 


The Status register is used to monitor the status of PCI bus-related events. This register is read only and 
is hardwired to the value 0x0. 


Description 
Reserved. Default is 0x0. 


New Capabilities (AGP/ACPI). Default is 1 for AGP/ACPI. (Strapped) 
66 MHz Capable. Default is 0 for PCI 33 MHz, 1 for AGP. (Strapped) 
6 UDF supported. Default is 0. 
Fast Back-to-Back capable. Default is 0. (Strapped) 


Revision_ID Register 


The Revision_ID register is used to identify the revision number of the PCI device. This register is read 
only. 


Description 


Banshee Revision Identification. (0=na, 1=.35u, 2=.25u) 


Class_code Register 


The Class_code register is used to identify the generic functionality of the PCI device. See the PCI 
specification for more information. This register is read only. 


Description 


Class Code. Default is 0x3. 


Cache_line_size Register 


The Cache_line_size register specifies the system cache line size in doubleword increments. It must be 
implemented by devices capable of bus mastering. This register is read only and is hardwired to 0x0. 


Bit Description 


Cache Line Size. Default is 0x0. 


Latency_timer Register 


The Latency_timer register specifies the latency of bus master timeouts. It must be implemented by 
devices capable of bus mastering. This register is read only and is hardwired to 0x0. 


Description 


Latency Timer. Default is 0x0. 


Header_type Register 


The Header_type register defines the format of the PCI base address registers (memBaseAddr in 
Banshee). Bits 6:0 are read only and hardwired to 0x0. Bit 7 of Header_type specifies Banshee as a 
single function PCI device. 


Description 


Header Type. Default is 0x0. 
Multiple-Function PCI device (0=single function, 1=multiple function). Default is 0x0. 


BIST Register 


The BIST register is implemented by those PCI devices that are capable of built-in self-test. Banshee does 
not provide this capability. This register is read only and is hardwired to 0x0. 


Bit Description 


7:0 BIST field and configuration. Default is 0x0. 


memBaseAddr0 Register 


The memBaseAddr register determines the base address for all PCI memory mapped accesses to 
Banshee. Writing 0xffffffff to this register will reset it to its default state. Once memBaseAddr has been 
reset, it can be probed by software to determine the amount of memory space required for Banshee. A 
subsequent write to memBaseAddr will set the memory base address for all PCI memory accesses. See 
the PCI specification for more details on memory base address programming. Banshee requires 32 
Mbytes of address space for memory mapped accesses. For memory mapped accesses on the 32-bit PCI 
bus, the contents of memBaseAddr are compared with the pci_ad bits 31..25 (upper 7 bits) to determine 
if Banshee is being accessed. This register is R/W. 


Bit Description 


31:0 Memory Base Address. Default is 0xf8000000. 


memBaseAddr1 Register 


The memBaseAddr register determines the base address for all PCI memory mapped accesses to 
Banshee. Writing 0xffffffff to this register will reset it to its default state. Once memBaseAddr has been 
reset, it can be probed by software to determine the amount of memory space required for Banshee. A 
subsequent write to memBaseAddr will set the memory base address for all PCI memory accesses. See 
the PCI specification for more details on memory base address programming. Banshee requires 32 
Mbytes of address space for memory mapped accesses. For memory mapped accesses on the 32-bit PCI 
bus, the contents of memBaseAddr are compared with the pci_ad bits 31..25 (upper 7 bits) to determine 
if Banshee is being accessed. This register is R/W. 


Memory Base Address. Default is 0xf8000008. 
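
The probing and decode behaviour described for the memory base address registers can be illustrated with the C sketch below. It assumes the usual PCI convention that the low four bits of a memory base address register are flag bits rather than address bits; the function names are hypothetical, and the readback value in the usage note is an example, not a value quoted from this document.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a PCI memory region, computed from the value read
 * back after software writes all ones to the base address register.
 * Assumes bits 3:0 are flag bits (standard PCI memory BAR layout). */
static uint32_t pci_mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & 0xfffffff0u;
    assert(mask != 0u);        /* zero would mean "not implemented" */
    return ~mask + 1u;
}

/* Address decode as described above: pci_ad bits 31..25 are compared
 * with the same bits of the programmed base address. */
static int banshee_mem_hit(uint32_t mem_base_addr, uint32_t pci_ad)
{
    return (pci_ad >> 25) == (mem_base_addr >> 25);
}
```

A readback of 0xfe000000 would correspond to the 32-Mbyte region stated above, since only the upper 7 bits are writable.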


ioBaseAddr Register 


The ioBaseAddr register determines the base address for all PCI IO mapped accesses to Banshee. 
Writing 0xffffffff to this register will reset it to its default state. Once ioBaseAddr has been reset, it can 
be probed by software to determine the amount of IO space required for Banshee. A subsequent write to 
ioBaseAddr will set the IO base address for all PCI IO accesses. See the PCI specification for more 
details on IO base address programming. Banshee requires 256 bytes of address space for IO mapped 
accesses. For IO mapped accesses on the 32-bit PCI bus, the contents of ioBaseAddr are compared with 
the pci_ad bits 31..8 (upper 24 bits) to determine if Banshee is being accessed. This register is R/W. 


Bit Description 
IO Base Address. Default is 0xffffff01. 


subVendorID Register 


The subVendorID register defines the board manufacturer ID. During system initialization the expansion 
code located at romBaseAddr will set this register to the appropriate value. This register is read during 
plug and play initialization. See the PC97 specification for more details on subVendorID and plug and 
play requirements. The default value for this register is automatically loaded after reset from the ROM. 
Bits 7:0 are stored in ROM location 0x7ff8, while bits 15:8 are stored in 0x7ff9 for a 32K ROM. Bits 7:0 
are stored in ROM location 0xfff8, while bits 15:8 are stored in 0xfff9 for a 64K ROM. 


Subsystem Vendor ID register, Initialized by expansion prom, default is read by ROM. 


subSystemID Register 


The subSystemID register defines the board type. During system initialization, the expansion code 
located at romBaseAddr will set this register to the appropriate value. This register is read during plug 
and play initialization. See the PC97 specification for more details on subSystemID and plug and play 
requirements. The default value for this register is automatically loaded after reset from the ROM. Bits 7:0 
are stored in ROM location 0x7ffa, while bits 15:8 are stored in 0x7ffb for a 32K ROM. Bits 7:0 are stored 
in ROM location 0xfffa, while bits 15:8 are stored in 0xfffb for a 64K ROM. 


Subsystem ID register, Initialized by expansion prom, default is read by ROM. 
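
The ROM locations listed for subVendorID and subSystemID sit at fixed distances from the end of the ROM, which the following C sketch makes explicit. The struct and function names are hypothetical; only the byte locations come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets inside the ROM image of the ID bytes loaded after reset. */
struct rom_id_offsets {
    uint32_t sub_vendor_lo;  /* subVendorID bits 7:0  */
    uint32_t sub_vendor_hi;  /* subVendorID bits 15:8 */
    uint32_t sub_system_lo;  /* subSystemID bits 7:0  */
    uint32_t sub_system_hi;  /* subSystemID bits 15:8 */
};

/* rom_size is 0x8000 for a 32K ROM or 0x10000 for a 64K ROM. */
static struct rom_id_offsets banshee_rom_id_offsets(uint32_t rom_size)
{
    struct rom_id_offsets o;
    assert(rom_size == 0x8000u || rom_size == 0x10000u);
    o.sub_vendor_lo = rom_size - 8u;
    o.sub_vendor_hi = rom_size - 7u;
    o.sub_system_lo = rom_size - 6u;
    o.sub_system_hi = rom_size - 5u;
    return o;
}

/* Assemble a 16-bit ID from its low and high bytes in a ROM image. */
static uint16_t rom_read_id(const uint8_t *rom, uint32_t lo, uint32_t hi)
{
    return (uint16_t)(rom[lo] | ((uint16_t)rom[hi] << 8));
}
```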


romBaseAddr Register 


The romBaseAddr register determines the base address for all PCI ROM accesses to Banshee. Writing 
0xfffffffe to this register will reset it to its default state. Once romBaseAddr has been reset, it can be 
probed by software to determine the amount of ROM space required for Banshee. A subsequent write to 
romBaseAddr will set the ROM base address for all PCI ROM accesses. See the PCI specification for 
more details on memory base address programming. Banshee requires 32 to 64 Kbytes of address space 
for ROM accesses and is configured by strapping bit 2. For ROM accesses on the 32-bit PCI bus, the 
contents of romBaseAddr are compared with the pci_ad bits 31..16 (upper 16 bits) to determine if 
Banshee is being accessed. This register is R/W. 


Expansion ROM Base Address. Default is 0xffff8000 or 0xffff0000. 


Capabilities Pointer 


The Capabilities pointer register contains the offset in configuration space of the beginning of the 
capability linked list structure. This register is read only. 


Description 


31:0 Capabilities Pointer offset. Default is 0x00000054 if AGP is enabled via the strapping 
bits, otherwise it is 0x60. 


Interrupt_line Register 


The Interrupt_line register is used to map PCI interrupts to system interrupts. In a PC environment, for 
example, the values of 0 to 15 in this register correspond to IRQ0-IRQ15 on the system board. The value 
0xff indicates no connection. This register is R/W. 


Interrupt Line. Default is 0x5 (IRQ5) 


Interrupt_pin Register 


The Interrupt_pin register defines which of the four PCI interrupt request lines, INTA* - INTD*, the 
PCI device is connected to. This register is read only and is hardwired to 0x1. 


Interrupt Pin. Default is 0x1 (INTA*) 


Min_gnt Register 


The Min_gnt register specifies the burst period a PCI bus master requires. It must be implemented by 
devices capable of bus mastering. This register is read only and is hardwired to 0x0 since Banshee does 
not support bus mastering. 


Minimum Grant. Default is 0x0. 


Max_lat Register 


The Max_lat register specifies the maximum request frequency a PCI bus master requires. It must be 
implemented by devices capable of bus mastering. This register is read only and is hardwired to 0x0 since 
Banshee does not support bus mastering. 


fabID Register 


Identification code of the manufacturing plant. 


Description 


Manufacturing fab identification. Read only. (1 = TSMC) 


Scratch pad. (read / write) 


Copyright © 1996-1999 3Dfx Interactive, Inc. Revision 1.01 
101 Printed 03/01/99 


Voodoo Banshee Universal Access 2d Databook 


cfgStatus Register 


The cfgStatus register is an alias to the normal memory-mapped status register. See section x.x for a 
description of the status registers. Reading the configuration space cfgStatus register returns the same 
data as if reading from the memory-mapped status register. 


cfgScratch Register 


The cfgScratch register can be used as scratch pad storage space by software. The values of cfgScratch 
are not used internally to alter functionality, so any value can be stored to and read from cfgScratch. 


Description 
Scratchpad register. Default is 0x0. 


New capabilities (AGP and ACPI) 


AGP and ACPI use PCI's new capabilities mechanism. The New Capabilities structure is implemented 
as a linked list of registers containing information for each function supported by Banshee. The list 
contains both AGP status and command registers. AGP registers read back ‘0’ if AGP is disabled via the 
strapping pins. 


Capability Identifier Register 
The capability register resides at offset (CAP_OFFSET). This register identifies AGP revision compliance. 


Capability ID. Always == 2 for AGP 


15:8 


AGP Status 


The AGP status register reports the maximum number of requests that Banshee can manage, whether 
Banshee is capable of AGP sideband addressing, and the supported transfer rates. 


Description 


Data rates that Banshee can deliver/receive. Bit[0] = 1x, bit[1] = 2x. Default is 1. 
Reserved. Default is 0. 
AGP_4G. AGP supports above 4 Gigabytes of memory. Default is 1. 
8:6 Reserved. Default is 0. 
SBA. Device supports side band addressing. 
23:10 Reserved. Default is 0. 
31:24 RQ_DEPTH. Max # of requests that Banshee can manage. Default is 7. 
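
The data rate and RQ_DEPTH fields can be decoded as in the C sketch below. It also shows one way software might choose a single data rate bit and request depth for the AGP command register, which uses the same two field positions. The function names are hypothetical, and the "target status" argument stands for the status word of the AGP target on the other side of the bus, which is outside the scope of this section.

```c
#include <assert.h>
#include <stdint.h>

/* RQ_DEPTH field, bits 31:24 of the AGP status register. */
static unsigned agp_status_rq_depth(uint32_t status)
{
    return (status >> 24) & 0xffu;
}

/* Pick the fastest rate both sides advertise: bit 0 = 1x, bit 1 = 2x.
 * Returns 0 when no common rate exists. */
static uint32_t agp_choose_rate(uint32_t device_status, uint32_t target_status)
{
    uint32_t common = device_status & target_status & 0x3u;
    if (common & 0x2u)
        return 0x2u;
    return common & 0x1u;
}

/* Build the rate and RQ_DEPTH fields of an AGP command value.
 * Exactly one rate bit may be set. */
static uint32_t agp_command_pack(uint32_t rate, unsigned rq_depth)
{
    assert(rate == 0x1u || rate == 0x2u);
    assert(rq_depth <= 0xffu);
    return rate | ((uint32_t)rq_depth << 24);
}
```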


AGP Command 


The AGP command register selects the data rate, enables AGP and side band addressing, and sets the 
maximum number of requests the system can handle. 


Description 
Data Rate. Bit[0] = 1x, bit[1] = 2x. Exactly one bit must be set. (R/W) 
Reserved. Default is 0. 
AGP_4G_ENABLE. AGP supports above 4 Gigabytes of memory. Default is 0. 
Reserved. Default is 0. 
AGP enable. Enables AGP function. AGP_RESET sets this bit to 0. (R/W) 
SBA_ENABLE. Enable side band addressing mechanism. (R/W) 
23:10 Reserved. Default is 0. 
31:24 RQ_DEPTH. Max # of requests System can handle. (R/W) 


ACPI Cap ID 
The ACPI Cap ID register identifies what Banshee supports in ACPI. 


Description 


Capability ID. Always == 1 for ACPI. 
Next Capability ID Pointer. Default is 0. 
18:16 Version. Default is 0x1. 
19 PME Clock. Default is 0. 
20 Aux Power Source. Default is 0. 
21 DSI. Default is 1. Indicates additional software initialization must take place. 
24:22 Reserved. Default is 0. 
25 D1 Support. Default is 0. 
26 D2 Support. Default is 0. 
31:27 PME Support. Default is 0. 


ACPI Ctrl/Status 


The ACPI status register allows transition from the D3 to the D0 state. 


1:0 Power State. Defaults to 0x0. Banshee only accepts writes of 0x0 or 0x3 to these bits. (R/W) 
Reserved. Default is 0. 
Sticky bit. Default is 0. 
Data Select. Default is 0. 
14:13 Data Scale. Default is 0. 
15 Sticky bit. Default is 0. 
22 B2 B3 support. Default is 0. 
23 BPCC_En. Default is 0. 


12. Init Registers 


Register Name Bits Description 
strapInfo | 38-36 | 10 | R | Strap bits after power up. 


status Register (0x0) 
The status register provides a way for the CPU to interrogate the graphics processor about its current state 
and FIFO availability. The status register is read only, but writing to status clears any Banshee generated 
PCI interrupts. 


6 Vertical retrace (0=Vertical retrace active, 1=Vertical retrace inactive). Default is 1. 
8 TREX busy (0=engine idle, 1=engine busy). Default is 0. 
9 Banshee busy (0=idle, 1=busy). Default is 0. 


Bits(4:0) show the number of entries available in the internal host FIFO. The internal host FIFO is 32 
entries deep. The FIFO is empty when bits(4:0)=0x1f. Bit(6) is the state of the monitor vertical retrace 
signal, and is used to determine when the monitor is being refreshed. Bit(7) of status is used to determine 
if the graphics engine of FBI is active. Note that bit(7) only determines if the graphics engine of FBI is 
busy; it does not include information as to the status of the internal PCI FIFOs. Bit(8) of status is used 
to determine if TREX is busy. Note that bit(8) of status is set if any unit in TREX is not idle; this 
includes the graphics engine and all internal TREX FIFOs. Bit(9) of status determines if all units in the 
Banshee system (including graphics engines, FIFOs, etc.) are idle. Bit(9) is set when any internal unit in 
Banshee is active (e.g. graphics is being rendered or any FIFO is not empty). Bit(10) of status is used to 
determine if the 2D graphics engine is active. Bits(11:10) of status are used to determine if either command 
fifo 0 or command fifo 1 is active. When a SWAPBUFFER command is received from the host CPU, bits 
(30:28) are incremented; when a SWAPBUFFER command completes, bits (30:28) are decremented. 
Bit(31) of status is used to monitor the status of the PCI interrupt signal. If Banshee generates a vertical 
retrace interrupt (as defined in pciInterrupt), bit(31) is set and the PCI interrupt signal line is activated to 
generate a hardware interrupt. 
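
The status fields described above can be decoded with a few masks, as in the C sketch below. The macro and function names are hypothetical; the bit positions are the ones given in the text, and the command fifo bits are left out.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_FIFO_FREE_MASK  0x0000001fu  /* bits 4:0: free host FIFO entries */
#define STATUS_VRETRACE        (1u << 6)    /* 0 = retrace active, 1 = inactive */
#define STATUS_FBI_BUSY        (1u << 7)
#define STATUS_TREX_BUSY       (1u << 8)
#define STATUS_BANSHEE_BUSY    (1u << 9)
#define STATUS_PCI_INTERRUPT   (1u << 31)

/* Number of free entries reported in bits 4:0. */
static unsigned status_fifo_free(uint32_t status)
{
    return status & STATUS_FIFO_FREE_MASK;
}

/* True when at least `needed` FIFO entries are reported free. */
static int status_fifo_has_room(uint32_t status, unsigned needed)
{
    assert(needed <= 31u);  /* bits 4:0 cannot report more than 0x1f */
    return status_fifo_free(status) >= needed;
}

/* True when every unit is idle (bit 9 clear). */
static int status_idle(uint32_t status)
{
    return (status & STATUS_BANSHEE_BUSY) == 0u;
}

/* Pending SWAPBUFFER commands, bits 30:28. */
static unsigned status_swaps_pending(uint32_t status)
{
    return (status >> 28) & 0x7u;
}
```

A driver would typically poll the status register until status_idle() holds before reprogramming the init registers.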


pciInit0 Register (0x4) 

The pciInit0 register contains the control information on how PCI should behave. Bits 15:0 are the output of 
the counter clocked by GRX clock. Bits 19:18 control interrupts. Bits 17:13 allow the retry interval to be 
increased, while bits 12 and 11 allow retries to be disabled. Bits 9 and 8 determine the bus performance. 
Bits 6:2 determine the PCI FIFO low water mark. This value should never be 0 (no overflow checking is 
done) and should be set greater than 2 for any fast device operations. Bits 25:20 control how many non 
modal LFB accesses are grouped together before being pushed to memory. This register is read write and 
defaults to 0x01800040. 
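
As a worked check of the low water mark field, the C sketch below extracts bits 6:2 from the stated default of 0x01800040 and shows how a new value might be inserted. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PCIINIT0_DEFAULT    0x01800040u
#define PCIINIT0_LWM_SHIFT  2
#define PCIINIT0_LWM_MASK   (0x1fu << PCIINIT0_LWM_SHIFT)  /* bits 6:2 */

/* Read the PCI FIFO low water mark field. */
static unsigned pciinit0_get_lwm(uint32_t value)
{
    return (value & PCIINIT0_LWM_MASK) >> PCIINIT0_LWM_SHIFT;
}

/* Replace the low water mark field, leaving all other bits unchanged. */
static uint32_t pciinit0_set_lwm(uint32_t value, unsigned lwm)
{
    assert(lwm >= 1u && lwm <= 31u);  /* the text says 0 must never be used */
    return (value & ~PCIINIT0_LWM_MASK) | ((uint32_t)lwm << PCIINIT0_LWM_SHIFT);
}
```

Decoding the default value gives 0x10, which agrees with the default listed for the low water mark.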


PCI FIFO Empty Entries Low Water Mark. Valid values are 0-31. Default is 0x10. 
Reserved. Default is 0x1. 
Wait state cycles for PCI read accesses (0=1 ws, 1=2 ws). Default is 0x0. 


26 Force PCI/CMD Frame buffer accesses to high priority. (1=high, 0=low, except for 
PCI frame buffer reads). Default is 0x0. 


lfbMemoryConfig Register (0xC) 
This register defaults to 0xa2200. 


miscInit0 Register (0x10) 


miscInit0 contains resets to all subsystems, pixel swizzling, and Y origin subtraction. Bits [1:0] reset the 
3D graphics subsystem. Bits [3:2] enable byte/word swizzling during register accesses to 2D or 3D. Bits 
[6:4] define resets for video, 2D, and memory subsystems. Bits [29:18] define the Y origin subtraction 
value used during address calculation when Y flip is enabled in fbzMode. Bits [31:30] enable byte/word 
swizzling during non modal LFB reads and writes. 


Bit Description 
Miscellaneous Control 
0 FBI Graphics Reset (0=run, 1=reset). Default is 0. 
FBI FIFO Reset (0=run, 1=reset). Default is 0. [resets PCI FIFO and the PCI data 
packer] 
Byte swizzle incoming register writes (1=enable). [Register byte data is swizzled if 
miscInit0[2]==1 and pci_address[20]==1] for 3D registers, and pci_address[19] == 
for 2D registers. Default is 0. 
Word swizzle incoming register writes (1=enable). [Register word data is swizzled if 
miscInit0[2]==1 and pci_address[20]==1] for 3D registers, and pci_address[19] == 
for 2D registers. Default is 0. 
Video Timing Reset (0=run, 1=reset). Default is 0. 
6 Memory Timing Reset (0=run, 1=reset). Default is 0. 
Programmable delay to be added to the blank signal before it outputs to the TV out 
interface. This is in terms of number of flops clocked by the 2x clock. The objective is to 
synchronize the blank signal with the data output by matching the CLUT delay. Default 
is 0x0. 
000 = 2 flops; 001 = 3 flops; 111 = 9 flops 
Programmable delay to be added to the vsync and hsync signals before they are output 
to the TV out interface. This is in terms of number of flops clocked by the 2x clock. The 
objective is to synchronize the sync signals with the data output by matching the CLUT 
delay. Default is 0x0. 
000 = 2 flops; 001 = 3 flops; 111 = 9 flops 
16:14 Programmable delay to be added to the vsync and hsync signals before they are output 
to the monitor. This is in terms of number of flops clocked by the 2x clock. The 
objective is to synchronize the sync signals with the data output by matching the delay 
through the CLUT and DAC. Default is 0x0. 
000 = 2 flops; 001 = 3 flops; 111 = 9 flops 
29:18 Y Origin Swap subtraction value (12 bits). Default is 0x0. 
Byte swizzle incoming non modal LFB writes (1=enable). Default is 0. 
Word swizzle incoming non modal LFB writes (1=enable). Default is 0. 
Y Origin Definition bits 
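
The delay encoding (000 = 2 flops through 111 = 9 flops) and the two miscInit0 fields whose bit positions are stated explicitly, the monitor sync delay in bits 16:14 and the Y origin subtraction value in bits 29:18, can be sketched in C as follows. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 3-bit delay code selects code + 2 flops: 0 -> 2, 1 -> 3, 7 -> 9. */
static unsigned misc_delay_code_to_flops(unsigned code)
{
    assert(code <= 7u);
    return code + 2u;
}

/* Set the monitor vsync/hsync delay field, bits 16:14. */
static uint32_t miscinit0_set_monitor_sync_delay(uint32_t value, unsigned code)
{
    assert(code <= 7u);
    return (value & ~(0x7u << 14)) | ((uint32_t)code << 14);
}

/* Set the 12-bit Y origin subtraction value, bits 29:18. */
static uint32_t miscinit0_set_y_origin(uint32_t value, unsigned y_sub)
{
    assert(y_sub <= 0xfffu);
    return (value & ~(0xfffu << 18)) | ((uint32_t)y_sub << 18);
}
```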


miscInit1 Register (0x14) 


miscInit1 register controls miscellaneous operations of Banshee available in real mode. Bit 0 is used to 
correct for CLUT addresses being inverted during host accesses. This bit should be set to 1 for proper 
operation. Bit 3 enables and disables writes to the PCI subVendorID and subSystemID registers. Bit 4 
enables writes to the ROM through romBaseAddr. Bit 5 enables the new triangle address aliasing 
allowing better address compaction. Bit 6 disables texture mapping. 


Power down of H3 is controlled by bits 11:7, where bit 7 powers down the color lookup tables, bit 8 
powers down the DAC itself, and bits 9, 10, and 11 power down the three PLLs. 


Bits 17 and 18 disable stalling on the opposite pipe (either 2D or 3D) when a command is sent down. 
These bits are used for testing, and should not be set during normal operation. 


Bit 19 is used to terminate command fifo activity. Setting this bit to ‘1’ halts the command fifo and resets 
all of the registers in the command register space to their default values. In order for Banshee to be shut 
down gracefully, this bit should only be set when Banshee is idle. Be sure to restore this bit to 0 when 
finished. 
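
The power down bits (11:7) and the command fifo reset bit (19) of miscInit1 can be expressed as masks, as in the C sketch below. The macro and function names are hypothetical, including the numbering of the three PLL bits, which the text does not name individually.

```c
#include <assert.h>
#include <stdint.h>

#define MISCINIT1_PD_CLUT    (1u << 7)   /* color lookup tables */
#define MISCINIT1_PD_DAC     (1u << 8)   /* DAC */
#define MISCINIT1_PD_PLL_A   (1u << 9)   /* three PLLs; names are placeholders */
#define MISCINIT1_PD_PLL_B   (1u << 10)
#define MISCINIT1_PD_PLL_C   (1u << 11)
#define MISCINIT1_PD_ALL     0x00000f80u /* bits 11:7 */
#define MISCINIT1_CMD_RESET  (1u << 19)  /* halts and resets the command fifo */

/* Set the requested power down bits without touching other fields. */
static uint32_t miscinit1_power_down(uint32_t value, uint32_t pd_mask)
{
    assert((pd_mask & ~MISCINIT1_PD_ALL) == 0u);
    return value | pd_mask;
}

/* Assert (1) or release (0) the command fifo reset bit. The text asks
 * that it be set only while the chip is idle and cleared afterwards. */
static uint32_t miscinit1_cmd_reset(uint32_t value, int in_reset)
{
    return in_reset ? (value | MISCINIT1_CMD_RESET)
                    : (value & ~MISCINIT1_CMD_RESET);
}
```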


Bits 28 through 24 indicate the value of the strapping registers at boot up. Note that altering these bits 
affects the read back information of PCI and AGP resource reporting. For more information on the 
strapping registers, see the section on Power on Strapping. 


Description 

Miscellaneous Control 

invert_clut_address. Default = 0. 
tri_mode - triangle iterator mode. Default is 0x0. 
Enable Sub Vendor/Subsystem ID writes (0=disable, 1=enable). Default is 0x0. 
Enable ROM writes (0=disable, 1=enable). Default is 0x0. 
Alternate triangle addressing map (0=disable, 1=enable). Default is 0x0. 
Disable texture mapping (0=enable, 1=disable). Default is 0x0. 

Power Down Control 

Disable 2D Block write. Default is 0x0. 
Disable 2D stall on 3D synchronous dispatch. When set to 1, 2D will not wait on 
pending 3D operations to complete before being issued. Default is 0x0. 
Disable 3D stall on 2D synchronous dispatch. When set to 1, 3D will not wait on 
pending 2D operations to complete before being issued. Default is 0x0. 
Reserved 
Command Stream Reset (1=reset command streams, 0=normal operation). Default is 0x0. 
29 
31:30 


dramInit0 Register (0x18) 


dramInit0 controls specific SGRAM interface timing parameters. The default value of this 
register is 0x00579d29. 


Description 

SGRAM access timing 

tRRD - row active to row active (1-4 clks). Default is 0x1 (2 clks). 
tRCD - RAS to CAS delay (1-4 clks). Default is 0x2 (3 clks). 
9:6 tRAS - minimum active time (1-16 clks). Default is 0x4 (5 clks). 

SGRAM write per bit enable (0=disable, 1=enable). Default is 0x0. 

Number of SGRAM chipsets (0=1, 1=2). Default is 0x0. (power on strap = 
VMI_DATA_5) 

SGRAM type (0=8Mbit, 1=16Mbit). Default is 0x0. (power on strap = VMI_DATA_6) 
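
As a worked check of the one timing field whose bit position is given (tRAS, bits 9:6), the C sketch below decodes it from the stated default of 0x00579d29. The encoding is taken to be "clocks minus one", which follows from the table entry "0x4 (5 clks)"; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DRAMINIT0_DEFAULT     0x00579d29u
#define DRAMINIT0_TRAS_SHIFT  6
#define DRAMINIT0_TRAS_MASK   (0xfu << DRAMINIT0_TRAS_SHIFT)  /* bits 9:6 */

/* tRAS in clocks: the 4-bit field stores (clocks - 1). */
static unsigned draminit0_tras_clks(uint32_t value)
{
    return ((value & DRAMINIT0_TRAS_MASK) >> DRAMINIT0_TRAS_SHIFT) + 1u;
}

/* Program tRAS (1 to 16 clocks), leaving all other bits unchanged. */
static uint32_t draminit0_set_tras(uint32_t value, unsigned clks)
{
    assert(clks >= 1u && clks <= 16u);
    return (value & ~DRAMINIT0_TRAS_MASK)
         | ((uint32_t)(clks - 1u) << DRAMINIT0_TRAS_SHIFT);
}
```

Decoding the default register value gives 5 clocks, matching the table.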


dramInit1 Register (0x1C) 


Description 

SGRAM Refresh Control 

Refresh Enable (0=disable, 1=enable). Default is 0. 
Refresh_load Value. (Internal 14-bit counter, 5 LSBs are 0x0.) Default is 0x100. 

Video Refresh Control 

Miscellaneous Video Control 

Triple buffer enable (0=double buffering, 1=triple buffering). Default = 0. 
sg_use_inv_sample - resample the flopped sgram data with another negative-edge flop 
before flopping data with mclk. Default = 0. 
sg_clk_adj - delay value for sgram read data sample clock. Default = 0x0. (2-62 NAND 
gates of delay, in steps of 4) 

SGRAM frame buffer output delay control (control + data bits) 

sg_oclk_del_adj - delay amount for clock out to SGRAMs. Default = 0xf. 
sg_oclk_nodelay - forces clock out to SGRAMs to have minimum delay (0=delayed, 
1=nodelay). Default = 0. 
mctl_no_vin_locking - prevent vin from locking the bus during requesting (0=allow 
locking, 1=prevent locking). Default = 0. 
30 mctl_type_sdram - (0=use SGRAMs, 1=use SDRAMs). Default = 0. VMI_ADDR_2 


When using SGRAMs, mctl_type_sdram should be set to 0. When using SDRAMs, only 16Mbit 
(16x512K) parts are supported, which results in a 16MB frame buffer. The sgram_type and sgram_chipsets 
bits in dramInit0 are ignored when mctl_type_sdram=1. 


Note that the fastfillCMD behaves differently when mctl_type_sdram=1 (dramInit1[30]). When 
fastfilling with SGRAMs (mctl_type_sdram=0), if dithering is enabled and fastfillCMD[0]=1, no 
dithering will happen. But when fastfilling with SDRAMs (mctl_type_sdram=1), if dithering is enabled 
and fastfillCMD[0]=1, dithering will still happen, since SDRAMs do not support blockwriting. 


agpInit0 Register (0x20) 


The agpInit0 register is used to control how AGP behaves when making requests. Bit 0 sets the request 
priority level. Bits [3:1] control the largest size the request can be. Bits [6:4] determine when the AGP 
request fifo becomes full (requests that have not yet been issued to the AGP target). Bits [10:7] control 
when too much data has been returned, and AGP needs to begin stalling. 


Description 
Force AGP request to be high priority (0=Low, 1=High). Default is 0x0. 
Maximum AGP request length (0=1 octword, 7=8 octwords). Default is 0x7. 
AGP request fifo full threshold. Default is 0x1. 
AGP read fifo full threshold. Default is 0x9. 
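
Using the field positions given in the description (bit 0, bits 3:1, bits 6:4 and bits 10:7), an agpInit0 value can be assembled as in the C sketch below. The function name is hypothetical, and the combined value in the usage note is derived here from the per-field defaults; it is not a figure quoted from the source.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble the low 11 bits of agpInit0 from its four fields.
 * max_req_octwords is given as a count (1 to 8) and stored as count - 1. */
static uint32_t agpinit0_pack(unsigned high_priority,
                              unsigned max_req_octwords,
                              unsigned req_fifo_full,
                              unsigned read_fifo_full)
{
    assert(high_priority <= 1u);
    assert(max_req_octwords >= 1u && max_req_octwords <= 8u);
    assert(req_fifo_full <= 7u);     /* 3-bit field, bits 6:4  */
    assert(read_fifo_full <= 15u);   /* 4-bit field, bits 10:7 */

    return (uint32_t)high_priority
         | ((uint32_t)(max_req_octwords - 1u) << 1)
         | ((uint32_t)req_fifo_full << 4)
         | ((uint32_t)read_fifo_full << 7);
}
```

With the per-field defaults (low priority, 8 octwords, thresholds 0x1 and 0x9) this sketch yields 0x49e.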


vgaInit0 Register (0x28) 


The vgaInit0 register is used for hardware initialization and configuration of the VGA controller in 
Banshee. VGA can be disabled by writing bit 0 to a “1”. Bit 1 allows external video timing to drive the 
VGA core video scan out logic. Bit 2 controls how the VGA DAC control logic views the width of the 
RAM. For VGA compatibility, this bit should be set to 0 (6 bit DAC). 


VGA extensions are enabled by bit 6. These extensions are mentioned in the VGA portion of this spec, in 
the CRTC register space. Bit 10 enables the ability to read back the PCI configuration when bit 6 of this 
register is 0. 


Bit 8 determines if the chip should wake up as a VGA motherboard device or an add in card. Bit 9 
disables VGA response to legacy address decoding. This bit should be set if Banshee is not the primary 
display adapter in the system. By default, this bit is set if Banshee is set as a multimedia device with the 
strapping bits. Setting this bit also disables write access to 0x46e8 and 0x102. 


Bit 12 should be set when in an extended (non-VGA) mode. This disables the VGA from fetching 
memory data during video raster scan out. 


Bit 13 is used when an external DAC is supported. This bit should always be set to 0. 


Bits 21:14 determine the start page of VGA in board memory. By default, VGA is placed at the beginning 
of memory. If need be, it can be moved anywhere on a 64K byte boundary within 16M bytes. 


Bit 22 disables VGA refresh control of board memory. When VGA is in scan out mode, it prefers 
memory refresh to happen at horizontal sync time. When this bit is set to 0, three memory refresh cycles 
happen after HBLANK occurs, and the memory refresh time out counter is deferred. When this bit is set 
to 1, the memory refresh time out counter explicitly controls memory refresh events. 
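
The VGA start page field (bits 21:14, in 64K byte units) and a few of the single-bit controls named above can be sketched in C as follows. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VGAINIT0_VGA_DISABLE       (1u << 0)
#define VGAINIT0_EXT_ENABLE        (1u << 6)
#define VGAINIT0_LEGACY_DISABLE    (1u << 9)
#define VGAINIT0_EXTENDED_MODE     (1u << 12)
#define VGAINIT0_NO_HBLANK_REFRESH (1u << 22)
#define VGAINIT0_START_SHIFT       14
#define VGAINIT0_START_MASK        (0xffu << VGAINIT0_START_SHIFT)  /* bits 21:14 */

/* Place the VGA start page at a byte offset in board memory. The offset
 * must sit on a 64K byte boundary within the first 16M bytes. */
static uint32_t vgainit0_set_start(uint32_t value, uint32_t byte_offset)
{
    assert((byte_offset & 0xffffu) == 0u);
    assert(byte_offset < (16u << 20));
    return (value & ~VGAINIT0_START_MASK)
         | ((byte_offset >> 16) << VGAINIT0_START_SHIFT);
}
```

For example, moving VGA to the 1 Mbyte mark corresponds to page 0x10, so the field contributes 0x40000 to the register value.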


Description 

Miscellaneous Control 

VGA disable (0=Enable, 1=Disable). Setting this bit to 1 shuts off all 
access to the VGA core. 

Use external video timing. This bit is used to retrieve SYNC information 
through the normal VGA mechanism when the VGA CRTC is not providing 
timing control. 

VGA 6/8 bit CLUT (0=6 bit, 1=8 bit). 

Reserved. 

0x46e8/0x3C3 Wake up select (0=use 0x46e8, 1=use 0x3C3 or IO Base + 
0xC3). VGA add in cards use 0x46e8, while motherboard VGA uses 
0x3C3. When Banshee is a multimedia device, this bit should be set to ‘1’. 

Use alternate VGA Config read back (0=Enable, 1=Disable). Setting this 
bit to 0 allows the VGA to read back configuration through CRTC index 

Disable SGRAM refresh requests on HBLANK. When set to 1, the VGA 
does not produce memory refreshes during horizontal blanking. 

31:23 


vgaInit1 Register (0x2C) 


The vgaInit1 register contains the read and write apertures for VBE. VBE uses address 0xA0000 as an 
aperture into Banshee memory. See the section on VBE apertures in the VGA portion of this document. 
Bit 20 enables sequential chain mode, a pseudo packed pixel format. Bits 28:21 define lock bits that 
disable writes to specific sections of the VGA core. See the section on register locking in the VGA 
portion of this document. 


Description 
VBE write Aperture, in 32K granularity 
VBE read Aperture, in 32K granularity 
20 Enable 0xA0000 Sequential Chain 4 mode. 
Lock Horizontal Timing - 3B4/3D4 index 0,1,2,3,4,5,1a 
Lock Vertical Timing - 3B4/3D4 index 6, 7 (bits 7,5,3,2 and 0), 9, 10, 11 (bits[3:0]), 15, 16, 1b. 
Lock H2 - 0x3B4/0x3D4 index 17, bit 2 
24 Lock Vsync - 0x3C2, bit 7. 


2d_Command_Register (0x30) 


Writing to this register is the same as writing to the 2d unit’s command register. This mapping is 
intended to provide a way to initialize the SGRAM mode and special mode registers at init time. 


2d_srcBaseAddr Register (0x34) 


Writing to this register is the same as writing to the 2d unit’s srcBaseAddr register. This mapping is 
intended to provide a way to initialize the SGRAM mode and special mode registers at init time. 


13. Frame Buffer Access 


Frame Buffer Organization 


The Banshee linear frame buffer base address is located in a separate memory base address register in 
PCI config space and occupies 32 megabytes of address space for linear access. It is assumed (but not 
required) that VGA will use the first 256K of linear memory, and that the desktop and video will use the 
remaining linear memory. 


Linear Frame Buffer Access 


The linear frame buffer is accessed much like system memory, and can store the desktop, video, 3D front 
buffer, 3D back buffer, 3D auxiliary buffer, and textures. Memory management is done with a true linear 
memory manager. 


YUV Planar Access 


YUV planar memory allows the CPU to write Y, U, and V in separate regions of memory space. As Y, U, 
and V are written, they are converted into YUYV packed form, and stored in the frame buffer at the 
correct offset from the YUV base address register. The first megabyte region defines Y, where each 32-bit 
write generates a 64-bit write on Banshee, with appropriate byte masks. The second megabyte region of 
YUV planar memory defines U space, where each 32-bit write generates two 64-bit writes with 
appropriate byte enable bits. The third region of YUV planar memory defines the V space, where each 
32-bit write generates two 64-bit writes with appropriate byte enable bits. The conversion between planar 
and packed is described below. YUV planar space has a fixed 1024 byte stride, and a programmable 
destination stride. 


0xC00000   Y 
0xD00000   U 
0xE00000   V 
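
The region layout and fixed stride can be illustrated with a small offset calculation. This is only a sketch: the function and enum names are hypothetical, the region start values are taken from the map above, and it assumes that the third (V) region is also one megabyte long. It says nothing about chroma subsampling, which this section does not specify.

```c
#include <stdint.h>

/* Region start offsets, taken from the memory map above. */
#define YUV_Y_BASE 0xC00000u
#define YUV_U_BASE 0xD00000u
#define YUV_V_BASE 0xE00000u

#define YUV_PLANAR_STRIDE 1024u      /* fixed planar stride, in bytes */
#define YUV_REGION_SIZE   0x100000u  /* assumed: each region is 1 megabyte */

enum yuv_plane { PLANE_Y, PLANE_U, PLANE_V };  /* hypothetical names */

/* Returns the byte offset of sample (row, col) within the given plane,
 * or -1 if the sample falls outside the stride or the region. */
static int64_t yuv_planar_offset(enum yuv_plane plane, uint32_t row, uint32_t col)
{
    static const uint32_t base[] = { YUV_Y_BASE, YUV_U_BASE, YUV_V_BASE };

    if (col >= YUV_PLANAR_STRIDE)
        return -1;
    if (row >= YUV_REGION_SIZE / YUV_PLANAR_STRIDE)
        return -1;
    return (int64_t)base[plane] + (int64_t)row * YUV_PLANAR_STRIDE + col;
}
```

With a 1024-byte stride, each one-megabyte region holds at most 1024 rows, which is why the row check is derived from the region size rather than hard-coded.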
14. Accessing the ROM 


ROM Configuration 
Banshee supports either 32K or 64K of ROM space. The size of the ROM is determined at power 
up by an external strapping pin (see the section on strapping pins for more information). 


Directly after reset, PCI subsystem and subvendor information are loaded from the next to last four bytes 
of ROM memory. The last four bytes are reserved for checksum information. 
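
As a rough illustration, the two trailing fields can be located from the ROM size alone. The sketch below reads "next to last four bytes" as the four bytes immediately before the final four; that reading, and all names used, are assumptions rather than anything defined by Banshee.

```c
#include <stdint.h>

#define ROM_SIZE_32K 0x8000u
#define ROM_SIZE_64K 0x10000u

/* Offset of the four bytes holding PCI subsystem and subvendor
 * information (assumed to sit directly before the checksum bytes). */
static uint32_t rom_subsystem_offset(uint32_t rom_size)
{
    return rom_size - 8u;
}

/* Offset of the last four bytes, reserved for checksum information. */
static uint32_t rom_checksum_offset(uint32_t rom_size)
{
    return rom_size - 4u;
}
```

For a 32K ROM this places the subsystem information at offset 0x7FF8 and the checksum bytes at 0x7FFC.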


ROM Reads 


Banshee supports reads to the ROM through the normal PCI mechanism. To read the ROM, set bit 0 of 
the romBaseAddr register to 1. ROM accesses are then possible at the address indicated by the most 
significant bits of romBaseAddr. ROM reads can have any combination of byte enables asserted. However, 
since the ROM is a byte device, asserting multiple byte enables at once will cause the transfer of data 
on the PCI bus to be slow. 


It is important to note that the ROM shares the bus with VMI and TV out. During ROM accesses, data on 
these ports will become ROM information, producing what may appear to be bad pixels on the display. 
This is normal; however, if it is known that ROM accesses are to occur, it is recommended that VMI or TV 
out be disabled prior to ROM access. 


ROM Writes 
Banshee also supports a mechanism for programming flash ROMs when they are available. The model 
that Banshee uses is that of a 32K/64K EEPROM that allows programming by polling the EEPROM. 


By default, Banshee will not respond to writes pointed at by romBaseAddr. By enabling bit 0 of 
romBaseAddr and also setting bit 4 of miscInit1, writes pointed at by romBaseAddr will be processed. 
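
The enable conditions for ROM reads and writes can be summarized as two bit tests. This is a minimal sketch: the macro and function names are hypothetical, and the register values are assumed to have already been read by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define ROM_BASE_ADDR_ENABLE (1u << 0)  /* romBaseAddr bit 0 */
#define MISC_INIT1_ROM_WRITE (1u << 4)  /* miscInit1 bit 4 */

/* ROM reads are decoded once bit 0 of romBaseAddr is set. */
static bool rom_reads_enabled(uint32_t rom_base_addr)
{
    return (rom_base_addr & ROM_BASE_ADDR_ENABLE) != 0;
}

/* ROM writes additionally require bit 4 of miscInit1. */
static bool rom_writes_enabled(uint32_t rom_base_addr, uint32_t misc_init1)
{
    return rom_reads_enabled(rom_base_addr) &&
           (misc_init1 & MISC_INIT1_ROM_WRITE) != 0;
}
```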


Typically, programmable ROMs require a sequence of write events to place the device in 
‘Program Mode’. Then either a single write or multiple writes occur (depending on the ROM used) to fill in 
data. Finally, the ROM is polled via ROM reads to confirm the write is complete. This process is 
repeated until the ROM is completely written. 
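
The enter, write, and poll cycle described above might be structured as follows. This is a sketch under several assumptions: the rom_ops hooks are hypothetical, one byte is written per cycle, and the part is assumed to signal completion by returning the written value on a read. The in-memory fake ROM exists only so the loop can be exercised without hardware; a real device's unlock sequence and polling rule come from its data sheet.

```c
#include <stdint.h>

/* Hypothetical device hooks; the real sequences are device specific. */
struct rom_ops {
    void    (*enter_program_mode)(void *ctx);
    void    (*write_byte)(void *ctx, uint32_t offset, uint8_t value);
    uint8_t (*read_byte)(void *ctx, uint32_t offset);
};

/* Programs len bytes. Returns 0 on success, or -1 if a byte does not
 * read back correctly within max_polls failed reads. */
static int rom_program(const struct rom_ops *ops, void *ctx,
                       const uint8_t *data, uint32_t len, unsigned max_polls)
{
    for (uint32_t i = 0; i < len; i++) {
        unsigned polls = 0;

        ops->enter_program_mode(ctx);
        ops->write_byte(ctx, i, data[i]);
        while (ops->read_byte(ctx, i) != data[i]) {
            if (++polls >= max_polls)
                return -1;
        }
    }
    return 0;
}

/* In-memory stand-in for a ROM, used only to exercise the loop. */
struct fake_rom {
    uint8_t  cells[16];
    uint8_t  pending_value;
    uint32_t pending_offset;
    unsigned busy_reads;    /* reads remaining before the write lands */
    int      program_mode;
};

static void fake_enter(void *ctx)
{
    ((struct fake_rom *)ctx)->program_mode = 1;
}

static void fake_write(void *ctx, uint32_t offset, uint8_t value)
{
    struct fake_rom *rom = ctx;

    if (!rom->program_mode || offset >= sizeof rom->cells)
        return;                      /* ignored outside program mode */
    rom->pending_offset = offset;
    rom->pending_value = value;
    rom->busy_reads = 2;
    rom->program_mode = 0;
}

static uint8_t fake_read(void *ctx, uint32_t offset)
{
    struct fake_rom *rom = ctx;

    if (rom->busy_reads > 0) {
        if (--rom->busy_reads == 0)
            rom->cells[rom->pending_offset] = rom->pending_value;
        return (uint8_t)~rom->pending_value;  /* still busy: complement */
    }
    return offset < sizeof rom->cells ? rom->cells[offset] : 0xFF;
}
```

Passing the hooks through a struct keeps the loop independent of any particular ROM; only the three callbacks would change for a different part.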


For more information on how to program a specific ROM, see its data sheet or application notes. 


15. Power on Strapping Pins 


During power up, Banshee gets some of its configuration information from strapping pins. This 
information is used to control how Banshee will behave. 


Pin          Description 

VMI_ADDR_3   Unused 
             mctl_type_sdram (0=SGRAMs, 1=SDRAMs) 
VMI_ADDR_1   mctl_short_power_on (0=normal power-on, 1=for RTL simulation only) 
VMI_ADDR_0   Re-map IDSEL (0=IDSEL is IDSEL, 1=PCI_AD_16 is IDSEL) 
VMI_DATA_7   Disable PCI IRQ register (0=Enable, 1=Disable) 
VMI_DATA_6   SGRAM chip size (0=8Mb, 1=16Mb) 
VMI_DATA_5   SGRAM number of chips (0=4 parts, 1=8 parts) 
             PCI Device Type (0=VGA, 1=Multimedia) 
VMI_DATA_3   AGP Enable (0=Disabled, 1=Enabled) 
             PCI 66MHz (0=33MHz, 1=66MHz) 
VMI_DATA_1   BIOS Size (0=32K, 1=64K) 
VMI_DATA_0   PCI Fast Device (0=DEVSEL Medium, 1=DEVSEL Fast) 


16. Monitor Sense 


Banshee supports the ability to detect a monitor, as well as to determine whether the monitor is color or 
monochrome. This is accomplished with an internal MSENSE signal. MSENSE becomes active when a 
current is driven through either the RED, GREEN or BLUE DAC outputs. If a monochrome monitor is 
present, only the GREEN output will cause MSENSE to become active. MSENSE is readable through IO 
0x3c2, bit 4. 
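
One way to turn the MSENSE behavior into a detection routine is sketched below. The names are hypothetical, and two things are assumed: that an active MSENSE reads back as a 1 in bit 4, and that the caller has already driven each DAC output in turn and sampled IO 0x3c2 after each one.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSENSE_IO_PORT 0x3C2u     /* IO address given in the text */
#define MSENSE_BIT     (1u << 4)  /* bit 4 of that port */

enum monitor_type { MONITOR_NONE, MONITOR_MONO, MONITOR_COLOR };

/* Assumption: an active MSENSE reads back as a 1 in bit 4. */
static bool msense_active(uint8_t port_value)
{
    return (port_value & MSENSE_BIT) != 0;
}

/* Each argument is the MSENSE state sampled while only that DAC
 * output was being driven. */
static enum monitor_type classify_monitor(bool red, bool green, bool blue)
{
    if (red || blue)
        return MONITOR_COLOR;  /* a monochrome monitor responds to GREEN only */
    if (green)
        return MONITOR_MONO;
    return MONITOR_NONE;
}
```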


17. Hardware Initialization 

• PCI Configuration 
• DRAM Init 
• VGA Core Wakeup 
• Other Init? 


18. Data Formats 

